In this article I will show you the NumPy functions that help most with data processing and algorithms.
Nowadays, NumPy is one of the most widely used Python packages for managing and transforming structured data. Its popularity comes from its continuous improvement and from the advantages it offers over standard Python lists.
What are the advantages of NumPy?
- A NumPy array takes up much less memory than the equivalent Python list, because it stores elements of a single data type in one compact block. This matters most when you work with large multi-dimensional arrays.
- NumPy provides highly optimized functions that save us many steps we would otherwise have to program ourselves.
In short, the main advantages are its speed and efficiency.
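To make the memory claim concrete, here is a minimal sketch that compares the footprint of a Python list with that of a NumPy array holding the same integers. The size `n` and the explicit `dtype=np.int64` are assumptions chosen for illustration, and the exact byte count of the list depends on the Python build.

```python
import sys
import numpy as np

n = 1000  # illustrative size, not taken from the article
python_list = list(range(n))
numpy_array = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(x) for x in python_list)
# An array stores the raw values in a single buffer: 8 bytes per int64.
array_bytes = numpy_array.nbytes

print("list bytes: ", list_bytes)
print("array bytes:", array_bytes)
```

On a typical 64-bit CPython build the list needs several times more memory than the array.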
Most used commands
Below, I will show and explain the most commonly used commands, which can help you in the world of vector computing.
First, we must import the package. It is already installed in many environments; if it is not, you can install it with pip (pip install numpy).
import numpy as np
Create a list
We are going to create a Python list and a NumPy array so that we can check the type of each one.
python_list = [1,2,3,4,5,6,7,8,9,10]
numpy_list = np.array(python_list)
print("First Type:", type(python_list))
print("Second Type:", type(numpy_list))
Return
# First Type: <class 'list'>
# Second Type: <class 'numpy.ndarray'>
As you can see, in NumPy these data structures are not called 'lists'. They are 'numpy.ndarray' objects (n-dimensional arrays), similar to the arrays of other programming languages.
Arrays can also be created directly, without converting from a standard Python list.
Sequential Values
Create an array with the first 10 integers, starting at 0.
numpy_array_0 = np.arange(10)
Return
[0 1 2 3 4 5 6 7 8 9]
Create an array with numbers between 10 and 20 (20 not included)
numpy_array_10 = np.arange(10,20)
Return
[10 11 12 13 14 15 16 17 18 19]
Create an array with numbers between 0 and 20 (20 not included) with a step of 2
numpy_array_steps = np.arange(0,20,2)
Return
[ 0 2 4 6 8 10 12 14 16 18]
Continuous Values
Create an array with 9 evenly spaced numbers between 0 and 10 (both endpoints included).
numpy_array_cnt_step = np.linspace(0,10,9)
Return
[ 0. 1.25 2.5 3.75 5. 6.25 7.5 8.75 10. ]
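The difference between the two functions is easy to mix up, so here is a small sketch that puts them side by side: `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a number of points and includes the stop value unless `endpoint=False` is passed. The variable names are only illustrative.

```python
import numpy as np

# arange: fixed step, stop value excluded
by_step = np.arange(0, 10, 2.5)
# linspace: fixed number of points, stop value included by default
by_count = np.linspace(0, 10, 5)
# endpoint=False makes linspace exclude the stop value as well
by_count_open = np.linspace(0, 10, 4, endpoint=False)

print(by_step)        # [0.  2.5 5.  7.5]
print(by_count)       # [ 0.   2.5  5.   7.5 10. ]
print(by_count_open)  # [0.  2.5 5.  7.5]
```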
Static values
Although these functions may seem rarely useful, constant-valued arrays are often used for independent variables in predictive models, for example as a column of ones for the intercept term.
Create an array filled with 10 zeros or 10 ones
numpy_zeros = np.zeros(10)
numpy_ones = np.ones(10)
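As a hedged illustration of what these constructors return, the sketch below also passes a shape tuple and a `dtype`, and uses `np.full` for an arbitrary constant. None of these variants appear in the examples above; they are added here only to show the options.

```python
import numpy as np

zeros_1d = np.zeros(10)             # ten zeros, float64 by default
ones_2d = np.ones((2, 3))           # 2 rows x 3 columns of ones
sevens = np.full(5, 7)              # five copies of the value 7
int_zeros = np.zeros(4, dtype=int)  # integer zeros instead of floats

print(zeros_1d.dtype, zeros_1d.shape)  # float64 (10,)
print(ones_2d)
print(sevens)                          # [7 7 7 7 7]
print(int_zeros)                       # [0 0 0 0]
```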
Create an array of random values drawn from the standard normal (Gaussian) distribution, with mean 0 and standard deviation 1
numpy_random_float = np.random.randn(10)
Return
[ 0.51734484 -0.84932945 -1.16675107 -0.47245504 1.22452811 0.44823052 -1.13637128 -1.34081725 -0.38066925 1.99223281]
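If you want random integers instead, you can use the next function. Note that it draws uniformly from 1 to 4 (the upper bound 5 is not included) rather than from a normal distribution.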
numpy_random_int = np.random.randint(1,5,10)
Return
[2 1 2 4 1 3 4 3 3 3]
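Random results like the ones above change on every run. If reproducible values are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng`. This is a sketch of an alternative to the calls shown above, not a replacement the article prescribes, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for illustration

normal_values = rng.standard_normal(10)   # counterpart of np.random.randn(10)
int_values = rng.integers(1, 5, size=10)  # counterpart of np.random.randint(1, 5, 10)

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(seed=42)
same_normal = rng_again.standard_normal(10)

print(normal_values)
print(int_values)
print(np.array_equal(normal_values, same_normal))  # True
```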
Statistical calculations
If you want to obtain statistical results from the array data, every array provides methods for it, among them:
- max()
- min()
- mean()
- sum()
- std()
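A short sketch of these methods in use follows. The sample values are the same ones used in the indexing section of this article; the variable name `values` is just illustrative.

```python
import numpy as np

values = np.array([1, 3, 7, 1, 6, 7, 1, 3, 0])

print(values.max())   # 7
print(values.min())   # 0
print(values.sum())   # 29
print(values.mean())  # the sum divided by the nine elements
print(values.std())   # population standard deviation (ddof=0 by default)
```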
Moreover, if you want to perform operations between two or more arrays of the same shape, you can use the usual arithmetic operators such as '+', '-', '*' and '/'. They are applied element by element.
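As a minimal sketch of arithmetic between arrays, the example below uses two small arrays, `a` and `b`, invented for illustration. The last line shows that a single number can also be combined with every element of an array.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])

print(a + b)  # [11 22 33 44]
print(a - b)  # [ 9 18 27 36]
print(a * b)  # [ 10  40  90 160]
print(a / b)  # [10. 10. 10. 10.]
print(a * 2)  # [20 40 60 80]
```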
Indexing
Once we have an array, we access its data through indexes, which start at 0.
Get value by index
numpy_array = np.array([1,3,7,1,6,7,1,3,0])
numpy_value = numpy_array[4]
Return
6
Get the first or last 3 elements
numpy_value = numpy_array[:3]   # first three: [1 3 7]
numpy_value = numpy_array[-3:]  # last three: [1 3 0]
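One detail worth knowing about slices is that they are views on the original data rather than copies. The sketch below, with illustrative variable names, shows a slice with a step, a write through a view, and `.copy()` for when an independent array is wanted.

```python
import numpy as np

numpy_array = np.array([1, 3, 7, 1, 6, 7, 1, 3, 0])

every_second = numpy_array[1::2]  # start at index 1, step 2: [3 1 7 3]
middle = numpy_array[2:5]         # indexes 2, 3 and 4: [7 1 6]

# Basic slices are views: writing through one changes the original array.
first_three = numpy_array[:3]
first_three[0] = 99

# Use .copy() when an independent array is needed.
last_three = numpy_array[-3:].copy()
last_three[0] = -1

print(numpy_array)  # [99  3  7  1  6  7  1  3  0]
print(last_three)   # [-1  3  0]
```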
If we want to get every element that matches a condition, we can filter with a boolean array, in a way very similar to Pandas.
condition = numpy_array > 3
numpy_value = numpy_array[condition]
Return
[7 6 7]
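Conditions can also be combined. The following sketch, using the same sample values, shows `&` and `|` on boolean arrays, `np.where` to obtain positions, and assignment through a mask. The variable names are illustrative.

```python
import numpy as np

numpy_array = np.array([1, 3, 7, 1, 6, 7, 1, 3, 0])

# Combine conditions with & (and) and | (or); the parentheses are required.
between = numpy_array[(numpy_array > 1) & (numpy_array < 7)]     # [3 6 3]
extremes = numpy_array[(numpy_array == 0) | (numpy_array == 7)]  # [7 7 0]

# np.where returns the positions at which the condition holds.
positions = np.where(numpy_array > 3)[0]  # [2 4 5]

# A mask can also be used to assign values in place (on a copy here).
capped = numpy_array.copy()
capped[capped > 3] = 3  # [1 3 3 1 3 3 1 3 0]

print(between, extremes, positions, capped)
```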
Conclusion
NumPy is very easy to use and, at the same time, lets us manipulate very large arrays more comfortably. Also, the commands shown above can be used with N-dimensional arrays.
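As a closing sketch of the N-dimensional case, the example below builds a small 2-D array with `reshape` and applies indexing, statistics along an axis, and a boolean condition to it. The 3 x 4 shape is an arbitrary choice for illustration.

```python
import numpy as np

matrix = np.arange(12).reshape(3, 4)  # 3 rows x 4 columns, values 0 to 11

print(matrix.shape)        # (3, 4)
print(matrix[1, 2])        # 6, the element at row 1, column 2
print(matrix[:, 0])        # [0 4 8], the first column
print(matrix.sum(axis=0))  # [12 15 18 21], one sum per column
print(matrix.max(axis=1))  # [ 3  7 11], one maximum per row
print(matrix[matrix > 8])  # [ 9 10 11], a mask returns a flat array
```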