For those of you who are familiar with programming, it is nearly intuitive that when you create a matrix-variable, say a, and then create its (modified) copy, say b = a + 1, b will in fact be independent of a. This was not the case within older versions of NumPy, due to both names "physically" pointing at the same object in memory. Analyse the following case study:
>>> a = np.array([1,2,3,4,5])
>>> b = a + 1
>>> a; b
array([1, 2, 3, 4, 5]) array([2, 3, 4, 5, 6])
but now, if:
>>> b[0] = 7
>>> a; b
array([7, 2, 3, 4, 5]) array([7, 3, 4, 5, 6])
we affect the 0-th elements of both the a and b arrays. In order to “break the link” between them, you should create a new copy of the array using the .copy function:
>>> a = np.array([1,2,3,4,5])
>>> b = a.copy()
>>> b = a + 1
>>> b[0] = 7
>>> a; b
array([1, 2, 3, 4, 5]) array([7, 3, 4, 5, 6])
Fortunately, in Python 3.5 with NumPy 1.10.1+ that problem ceases to exist:
>>> import numpy as np
>>> np.__version__
'1.10.1'
>>> a = np.array([1,2,3,4,5])
>>> b = a + 1
>>> a; b
array([1, 2, 3, 4, 5]) array([2, 3, 4, 5, 6])
>>> b[0] = 7
>>> a; b
array([1, 2, 3, 4, 5]) array([7, 3, 4, 5, 6])
however, keep that pitfall in mind and check for potential errors within your future projects. Just in case. ☺
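The pitfall is easiest to reproduce with plain assignment and basic slicing, both of which share memory with the original array. The following is a minimal sketch, not taken from the original text; the names alias, view, shifted and clone are purely illustrative, and np.shares_memory is used only as a convenient check.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

alias = a            # plain assignment: a second name for the same array
view = a[1:4]        # basic slicing: a view onto a's memory buffer
shifted = a + 1      # arithmetic: a freshly allocated result array
clone = a.copy()     # explicit copy: an independent buffer

alias[0] = 7         # also changes a[0]
view[0] = 9          # also changes a[1]
shifted[2] = -1      # leaves a untouched
clone[4] = 0         # leaves a untouched

print(a)                              # [7 9 3 4 5]
print(alias is a)                     # True
print(np.shares_memory(a, view))      # True
print(np.shares_memory(a, shifted))   # False
print(np.shares_memory(a, clone))     # False
```

When in doubt whether two arrays are linked, a check of this kind before modifying either of them is cheaper than debugging a silently altered input.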
3.2.5. 1D Array Flattening and Clipping
For any already existing row vector you can substitute its elements with a desired value. Have a look:
>>> a = np.array([1,2,3,4,5])
>>> a.fill(0);
>>> a
array([0, 0, 0, 0, 0])
or
>>> a = np.array([1,2,3,4,5])
>>> a.flat = -1
>>> a
array([-1, -1, -1, -1, -1])
This is so-called flattening. Clipping, on the other hand, in its simplest form looks like:
>>> x = np.array([1., -2., 3., 4., -5.])
>>> i = np.where(x < 0)
>>> x.flat[i] = 0
>>> x
array([ 1., 0., 3., 4., 0.])
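The same "replace negatives with zero" step can be written in a few equivalent ways. The sketch below is an illustration under the assumption that the original array should stay untouched; the names masked, clipped and floored are hypothetical.

```python
import numpy as np

x = np.array([1., -2., 3., 4., -5.])

# 1. Boolean mask assignment on a copy, so that x itself is preserved
masked = x.copy()
masked[masked < 0] = 0

# 2. np.clip with a lower bound only (None means "no upper bound")
clipped = np.clip(x, 0, None)

# 3. Element-wise maximum of each value and 0
floored = np.maximum(x, 0)

print(masked)
print(np.array_equal(masked, clipped))   # True
print(np.array_equal(masked, floored))   # True
```

All three variants return a new array here, whereas x.flat[i] = 0 modifies x in place.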
Let’s consider an example. Working daily with financial time-series, we sometimes wish to separate, e.g., a daily return-series into two sub-series storing negative and positive returns, respectively. In NumPy we can do that by employing the .clip function.
Note: np.where returns the indexes corresponding to a specified condition (see Sections 3.3.3 and 3.8); .copy(), .fill and .flat are the other functions used in this section.
Say, the vector r holds daily returns of a stock. Then:
>>> r = np.array([0.09,-0.03,-0.04,0.07,0.00,-0.02])
>>> rneg = r.clip(-1, 0)
>>> rneg
array([ 0. , -0.03, -0.04, 0. , 0. , -0.02])
>>> rneg = rneg[rneg.nonzero()]
>>> rneg
array([-0.03, -0.04, -0.02])
Here, we end up with the rneg array storing all negative daily returns.
The .clip(-1, 0) function should be understood as: clip all values less than -1 to -1 and all values greater than 0 to 0. It makes sense in our case as we set a lower boundary of -1 (a -100.00% daily loss) on one side and 0.00% on the other side. Since zero is usually considered a “positive” return, the application of the .nonzero function removes the zeros from the rneg array.
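To make the clipping rule concrete, the sketch below checks that .clip(lo, hi) agrees with an element-wise maximum followed by an element-wise minimum. The array demo and the names lo, hi and manual are hypothetical and serve only to show a value falling below the lower bound.

```python
import numpy as np

r = np.array([0.09, -0.03, -0.04, 0.07, 0.00, -0.02])
lo, hi = -1.0, 0.0

clipped = r.clip(lo, hi)
# Same rule written out: raise everything below lo, then cap everything above hi
manual = np.minimum(np.maximum(r, lo), hi)
print(np.array_equal(clipped, manual))    # True

# Zeros produced by clipping and "true" zero returns are both dropped here
rneg = clipped[clipped.nonzero()]
print(rneg)                               # the three negative returns

# Hypothetical values: -1.5 is raised to -1, 0.5 is capped at 0
demo = np.array([-1.5, -0.5, 0.5]).clip(lo, hi)
print(demo)
```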
The situation becomes a bit trickier in the case of positive returns. We cannot simply type rpos = r.clip(0, 1). Why? It will replace all negative returns with zeros. Also, if r contains daily returns equal to 0.00, the extra zeros produced by clipping would be indistinguishable from them. We solve this problem by replacing “true” 0.00% stock returns with an abstract number of, say, 9, i.e. a 900% daily gain, and proceed as follows:
>>> r2 = r.copy(); r2
array([ 0.09, -0.03, -0.04, 0.07, 0. , -0.02])
>>> i = np.where(r2 == 0.); r2[i] = 9 # alternatively r2[r2==0.] = 9
>>> rpos = r2.clip(0, 9)
>>> rpos
array([ 0.09, 0. , 0. , 0.07, 9. , 0. ])
>>> rpos = rpos[rpos.nonzero()]
>>> rpos
array([ 0.09, 0.07, 9. ])
>>> rpos[rpos == 9.] = 0.
>>> rpos
array([ 0.09, 0.07, 0. ])
If you think for a while, you will discover that in fact all the effort can be shortened down to two essential lines of code providing us with the same results:
>>> r = np.array([0.09,-0.03,-0.04,0.07,0.00,-0.02])
>>> rneg = r[r < 0] # masking
>>> rpos = r[r >= 0] # masking
>>> rneg; rpos
array([-0.03, -0.04, -0.02]) array([ 0.09, 0.07, 0. ])
however by doing so you’d miss a lesson on the .clip function ☺. More on masking for arrays in Section 3.8.
As you can see, Python offers more than one method to solve the same problem. Knowing the majority of them gives you flexibility and will make you a good programmer over time.
By separating the two return-series we gain the possibility of conducting additional research on, for instance, the distribution of extreme losses or extreme gains for a specific stock in the time period the data come from. In the abovementioned example our return-series is too short for a complete demonstration; in general, however, if we want to extract from each series the two most extreme losses and the two highest gains, then:
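One possible way to do this is sketched below, assuming the two sub-series are obtained by masking and then sorted in place with .sort(), which orders a 1D array ascending. The names worst_two and best_two are hypothetical.

```python
import numpy as np

r = np.array([0.09, -0.03, -0.04, 0.07, 0.00, -0.02])

rneg = r[r < 0]     # masking returns a copy, so r is not affected below
rpos = r[r >= 0]

rneg.sort()         # in place, ascending: most negative return first
rpos.sort()         # in place, ascending: largest return last

worst_two = rneg[:2]    # two most extreme losses
best_two = rpos[-2:]    # two highest gains

print(worst_two)    # values -0.04 and -0.03
print(best_two)     # values 0.07 and 0.09
```

Because .sort() alters the array it is called on, it is applied here to the masked copies rather than to r itself.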
If repeated for, say, 500 stocks (daily return time-series) traded within the S&P 500 index, the same method would give us insight into the empirical distribution of extreme values, both negative and positive, which could be fitted with a Gumbel distribution and tested against GEV theory.
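As a rough illustration of that idea, the sketch below uses simulated returns as a stand-in for real data (the 500 x 250 array is an assumption, not market data) and estimates Gumbel parameters by the method of moments, using the facts that a Gumbel (maximum) law with location mu and scale beta has mean mu + gamma*beta and variance pi^2*beta^2/6. The helper gumbel_moments is hypothetical; a proper study would rather use maximum-likelihood fitting and a goodness-of-fit test.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in: 500 "stocks", 250 daily returns each (not real data)
returns = rng.normal(loc=0.0, scale=0.02, size=(500, 250))

worst = returns.min(axis=1)   # most extreme loss per stock
best = returns.max(axis=1)    # highest gain per stock


def gumbel_moments(maxima):
    """Method-of-moments (location, scale) estimates for a Gumbel (maximum) law."""
    scale = maxima.std(ddof=1) * np.sqrt(6.0) / np.pi
    loc = maxima.mean() - np.euler_gamma * scale
    return loc, scale


loc_gain, scale_gain = gumbel_moments(best)
# Minima are negated so that they can be treated as maxima
loc_loss, scale_loss = gumbel_moments(-worst)

print(loc_gain, scale_gain)
print(loc_loss, scale_loss)
```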
3.2.6. 1D Special Arrays
NumPy delivers an easy solution in the form of special arrays filled with zeros, ones, or being “empty”. You no longer need to create an array of specified dimensions first and then flatten it.
Therefore, in our arsenal we have:
>>> x = np.zeros(5); x
An alternative way to derive the same result is to apply the .repeat function to a 1-element array:
>>> x = np.array([0])
>>> x2 = np.full((1, 5), 1, dtype=np.int64)
>>> x2
array([[1, 1, 1, 1, 1]])
where the shapes of the arrays, x1 and x2, have been provided within the inner round brackets: 1 row and 5 columns.
Note: .sort() by default sorts all elements of a 1D array in ascending order and alters the array itself.
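The special-array constructors mentioned in this section can be compared side by side. This is a minimal sketch with hypothetical variable names; it does not reproduce the original listing.

```python
import numpy as np

z = np.zeros(5)                     # five zeros, float64 by default
o = np.ones(5)                      # five ones, float64 by default
e = np.empty(5)                     # uninitialised: contents are arbitrary

z_rep = np.array([0]).repeat(5)     # repeat a 1-element integer array 5 times

row_ones = np.ones((1, 5), dtype=np.int64)       # shape given as (rows, columns)
row_full = np.full((1, 5), 1, dtype=np.int64)    # same shape, explicit fill value

print(z, z.dtype)
print(z_rep, z_rep.dtype)           # integer zeros, unlike np.zeros(5)
print(row_full.shape)               # (1, 5)
print(np.array_equal(row_ones, row_full))   # True
```

Note that np.empty only reserves memory, so its values should always be overwritten before use.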
An additional special array, containing the numbers from 0 to N-1, can be created using the arange function. Analyse the following cases:
>>> a = np.arange(11)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> a = np.arange(10) + 1
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> a = np.arange(0, 11, 2)
>>> a
array([ 0, 2, 4, 6, 8, 10])
>>> a = np.arange(0, 10, 2) + 1
>>> a
array([1, 3, 5, 7, 9])
and also
>>> b = np.arange(5, dtype=np.float32)
>>> b
array([ 0., 1., 2., 3., 4.], dtype=float32)
Array—List—Array
A 1D array can be converted into a Python list by applying the .tolist() function:
>>> r = np.array([0.09,-0.03,-0.04,0.07,0.00,-0.02])
>>> l = r.tolist()
>>> l
[0.09, -0.03, -0.04, 0.07, 0.0, -0.02]
On the other hand, to go from a flat Python list to a NumPy 1D array, employ the asarray function:
>>> type(l)
<class 'list'>
>>> a = np.asarray(l)
>>> a
array([ 0.09, -0.03, -0.04, 0.07, 0. , -0.02])
>>> a.dtype
dtype('float64')
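The round trip can be checked in a few lines, and it also connects back to the copying pitfall: np.asarray hands back an existing array unchanged, whereas np.array makes a copy by default. The names back, same and duplicate in this sketch are hypothetical.

```python
import numpy as np

r = np.array([0.09, -0.03, -0.04, 0.07, 0.00, -0.02])

l = r.tolist()                 # array -> list of plain Python floats
back = np.asarray(l)           # list -> new float64 array

same = np.asarray(r)           # input is already an ndarray: no copy is made
duplicate = np.array(r)        # np.array copies its input by default

print(type(l[0]))              # <class 'float'>
print(np.array_equal(back, r)) # True
print(same is r)               # True
print(duplicate is r)          # False
```

If the result of np.asarray is going to be modified, calling .copy() on it first keeps the input array safe.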