pandas frame creation and insertion

Pandas frame creation

Create pandas frame using dictionary

Using normal dictionary data structure

Note that on older setups (Python before 3.7, or pandas before 0.23) a normal dictionary does not ensure the order of columns in the pandas frame. Newer versions keep the insertion order of the keys.

Example below:

```python
import pandas as pd
from tabulate import tabulate


def markdown_pd(df):
    # Print the frame as a Markdown pipe table, including the index.
    print(tabulate(df, tablefmt="pipe", headers="keys", showindex=True))
```

```python
new_dict = {}
new_dict['foo'] = [1, 2, 3]
new_dict['bar'] = [4, 5, 6]

pdframe_normal_dict = pd.DataFrame(new_dict)
markdown_pd(pdframe_normal_dict)
```

|    |   bar |   foo |
|:---|------:|------:|
| 0  |     4 |     1 |
| 1  |     5 |     2 |
| 2  |     6 |     3 |

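As you can see, the columns of the pandas frame are not in the order we want.

To put them in the correct order, select the columns with a list in the desired order. Assigning a list to pdframe_normal_dict.columns would only rename the labels and leave the data where it is, so foo would end up labelling the values 4, 5, 6.

```python
pdframe_normal_dict = pdframe_normal_dict[['foo', 'bar']]
```

Because a list has an order, we now get:

```python
markdown_pd(pdframe_normal_dict)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |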
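
The difference between reordering columns and merely relabelling them is easy to miss, so here is a minimal sketch of it. The names df, reordered and relabelled are illustrative, and the frame is built with its columns in bar, foo order on purpose.

```python
import pandas as pd

# Start from a frame whose columns are in the "wrong" order (bar, foo).
df = pd.DataFrame({'bar': [4, 5, 6], 'foo': [1, 2, 3]})

# Selecting with a list reorders the columns and keeps each label with its data.
reordered = df[['foo', 'bar']]

# Assigning to .columns only renames the labels; the data does not move.
relabelled = df.copy()
relabelled.columns = ['foo', 'bar']

print(reordered['foo'].tolist())   # values that belong to foo
print(relabelled['foo'].tolist())  # values that originally belonged to bar
```

Selection is the safer choice whenever the goal is to change the order rather than the names.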

Using OrderedDict data structure

```python
from collections import OrderedDict
```

An OrderedDict remembers the order in which its keys were inserted, so the column order of the pandas frame is ensured.

```python
order_dict = OrderedDict()
order_dict['foo'] = [1, 2, 3]
order_dict['bar'] = [4, 5, 6]
order_dict
```

```
OrderedDict([('foo', [1, 2, 3]), ('bar', [4, 5, 6])])
```

```python
pdframe_order_dict = pd.DataFrame(order_dict)
markdown_pd(pdframe_order_dict)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |

As you can see, the column order is what we want!
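
If relying on the dictionary's own ordering feels fragile, one option is to state the column order explicitly when building the frame. The sketch below assumes the data lives in a plain dict; the variable names are illustrative.

```python
import pandas as pd

data = {'foo': [1, 2, 3], 'bar': [4, 5, 6]}

# With dict input, columns= selects the keys in the order given here,
# regardless of how the dict itself is ordered.
df = pd.DataFrame(data, columns=['bar', 'foo'])

print(list(df.columns))
```

Each column keeps its own data, because the keys are looked up by name rather than by position.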

Create pandas frame from lists

```python
import numpy as np
```

Create a column list: ['foo', 'bar']
Create a numpy array with one row per record: np.array([[1, 4], [2, 5], [3, 6]])

```python
columns = ['foo', 'bar']
data = np.array([[1, 4], [2, 5], [3, 6]])
data.shape
```

```
(3, 2)
```

```python
new_pdframe = pd.DataFrame(data=data, columns=columns)
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |
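
When the values start out as one list per column rather than one list per row, the array has to be transposed before it matches the column list. A small sketch of that step, assuming two hypothetical lists named foo and bar:

```python
import numpy as np
import pandas as pd

foo = [1, 2, 3]
bar = [4, 5, 6]

# np.array([foo, bar]) has shape (2, 3): one row per column list.
# Transposing gives shape (3, 2): one row per record, which lines up
# with columns=['foo', 'bar'].
data = np.array([foo, bar]).T
df = pd.DataFrame(data=data, columns=['foo', 'bar'])

print(data.shape)
print(df)
```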

Insert column

Implicit insert without specifying the column position

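```python
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |

```python
new_pdframe.loc[:, 'foz'] = [7, 8, 9]
markdown_pd(new_pdframe)
```

|    |   foo |   bar |   foz |
|:---|------:|------:|------:|
| 0  |     1 |     4 |     7 |
| 1  |     2 |     5 |     8 |
| 2  |     3 |     6 |     9 |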

In this case, the new column is added as the rightmost column of the data frame.
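
Two other common ways to add a rightmost column are plain item assignment and assign. The sketch below uses illustrative names (df, df2, baz) that are not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6]}, columns=['foo', 'bar'])

# Item assignment adds the column in place, at the right-hand end.
df['foz'] = [7, 8, 9]

# assign returns a new frame and leaves the original untouched.
df2 = df.assign(baz=[10, 11, 12])

print(list(df.columns))
print(list(df2.columns))
```

assign is convenient in method chains, while item assignment is the shortest form when mutating the frame is acceptable.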

Explicit insert with a specified column position

Let's try inserting the new column in the middle.

```python
new_pdframe = pd.DataFrame(data=data, columns=columns)
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |

```python
new_pdframe.insert(1, 'foz', [7, 8, 9])
markdown_pd(new_pdframe)
```

|    |   foo |   foz |   bar |
|:---|------:|------:|------:|
| 0  |     1 |     7 |     4 |
| 1  |     2 |     8 |     5 |
| 2  |     3 |     9 |     6 |

Note that you can still insert the new column at the end by passing the corresponding index, which here is 2 (the current number of columns).

```python
new_pdframe = pd.DataFrame(data=data, columns=columns)
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| 0  |     1 |     4 |
| 1  |     2 |     5 |
| 2  |     3 |     6 |

```python
new_pdframe.insert(2, 'foz', [7, 8, 9])
markdown_pd(new_pdframe)
```

|    |   foo |   bar |   foz |
|:---|------:|------:|------:|
| 0  |     1 |     4 |     7 |
| 1  |     2 |     5 |     8 |
| 2  |     3 |     6 |     9 |
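
Rather than hard-coding the end position, it can be computed from the frame itself. The sketch below also shows what happens when the label already exists; the flag name duplicate_rejected is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6]}, columns=['foo', 'bar'])

# len(df.columns) is one past the last column, i.e. the end of the frame.
df.insert(len(df.columns), 'foz', [7, 8, 9])

# Inserting a label that already exists raises ValueError by default.
try:
    df.insert(0, 'foz', [0, 0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(list(df.columns))
print(duplicate_rejected)
```

insert modifies the frame in place and returns None, so its result should not be assigned back to the variable.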

Insert row

Inserting a row is not as easy as inserting a column.

To make this clearer, let's set the index of the pandas frame to some new values instead of 0, 1, 2.

```python
new_pdframe = pd.DataFrame(data=data, columns=columns, index=['a', 'b', 'c'])
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| a  |     1 |     4 |
| b  |     2 |     5 |
| c  |     3 |     6 |

As you can see, now the indices are characters instead of numbers.

Insert by loc[new_index_value]

By assigning to a new index value, the new row is appended at the end of the dataframe.

Note: the index value must not already exist, otherwise the existing row is overwritten instead of a new one being added.

```python
new_pdframe.index
```

```
Index(['a', 'b', 'c'], dtype='object')
```

```python
new_pdframe.loc['d'] = [4, 7]
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| a  |     1 |     4 |
| b  |     2 |     5 |
| c  |     3 |     6 |
| d  |     4 |     7 |
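
To see why the index value matters, here is a small sketch contrasting a new label with an existing one. The values written to row a are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6]},
                  columns=['foo', 'bar'], index=['a', 'b', 'c'])

# 'd' is a new label, so this enlarges the frame by one row.
df.loc['d'] = [4, 7]

# 'a' already exists, so this overwrites that row rather than adding one.
df.loc['a'] = [10, 40]

print(list(df.index))
print(df.loc['a'].tolist())
```

Checking for the label in df.index before assigning is a simple guard against overwriting by accident.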

Insert rows at a specific index position

In this case you want to insert a pandas frame at a specific position inside another pandas frame.

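```python
def insert_row(idx, df, df_insert):
    # Split the frame at integer position idx.
    dfA = df.iloc[:idx]
    dfB = df.iloc[idx:]

    # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result.
    df = pd.concat([dfA, df_insert, dfB])

    return df
```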
```python
new_pdframe = pd.DataFrame(data=data, columns=columns, index=['a', 'b', 'c'])
markdown_pd(new_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| a  |     1 |     4 |
| b  |     2 |     5 |
| c  |     3 |     6 |

```python
appended_pdframe = pd.DataFrame(data=np.array([[5, 5], [6, 6]]), columns=columns, index=['d', 'e'])
markdown_pd(appended_pdframe)
```

|    |   foo |   bar |
|:---|------:|------:|
| d  |     5 |     5 |
| e  |     6 |     6 |

Now let’s try adding it between b and c.

```python
updated_frame = insert_row(2, new_pdframe, appended_pdframe)
markdown_pd(updated_frame)
```

|    |   foo |   bar |
|:---|------:|------:|
| a  |     1 |     4 |
| b  |     2 |     5 |
| d  |     5 |     5 |
| e  |     6 |     6 |
| c  |     3 |     6 |
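
Since the target position was described by its labels (between b and c), it can be handy to look the position up from a label instead of counting rows. The helper insert_after_label below is a hypothetical variant, not part of pandas, and it assumes the index labels are unique.

```python
import pandas as pd


def insert_after_label(df, label, df_insert):
    # Hypothetical helper: translate the label into an integer position,
    # then split the frame there and concatenate the three pieces.
    # Assumes unique index labels, so get_loc returns a single integer.
    pos = df.index.get_loc(label) + 1
    return pd.concat([df.iloc[:pos], df_insert, df.iloc[pos:]])


base = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6]},
                    columns=['foo', 'bar'], index=['a', 'b', 'c'])
extra = pd.DataFrame({'foo': [5, 6], 'bar': [5, 6]},
                     columns=['foo', 'bar'], index=['d', 'e'])

result = insert_after_label(base, 'b', extra)
print(list(result.index))
```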

Insert by pd.concat

In this case, the new frame is appended at the end of the existing frame.

```python
updated_frame = pd.concat([new_pdframe, appended_pdframe])
markdown_pd(updated_frame)
```

|    |   foo |   bar |
|:---|------:|------:|
| a  |     1 |     4 |
| b  |     2 |     5 |
| c  |     3 |     6 |
| d  |     5 |     5 |
| e  |     6 |     6 |
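
pd.concat keeps the original index labels, which matters when the two frames share a label. The sketch below uses two small made-up frames (top and bottom) to show the ignore_index and verify_integrity options.

```python
import pandas as pd

top = pd.DataFrame({'foo': [1, 2]}, index=['a', 'b'])
bottom = pd.DataFrame({'foo': [3, 4]}, index=['b', 'c'])  # 'b' repeats on purpose

# By default the original labels are kept, so 'b' appears twice.
kept = pd.concat([top, bottom])

# ignore_index=True discards the labels and renumbers the rows from 0.
renumbered = pd.concat([top, bottom], ignore_index=True)

# verify_integrity=True raises ValueError when labels would be duplicated.
try:
    pd.concat([top, bottom], verify_integrity=True)
    duplicates_detected = False
except ValueError:
    duplicates_detected = True

print(list(kept.index))
print(list(renumbered.index))
print(duplicates_detected)
```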
