How to automate the operation of Excel with python 【Introduction for beginners】

Python
この記事は約14分で読めます。

VBA is probably the first choice when considering automation in Excel.

However, VBA I can not do at all. .. And I have no desire to learn. .. I wonder why I am motivated for some reason rather than being told to do ladder or C… I am not disappointed, and I know well that people who are actually used for work are utilized I think it’s a hobby issue, though. ..

So, I will introduce how to operate Excel using python. The module used is a module for operating Excel called OpenPyXL.

Advertisements

What is OpenPyXL?

OpenPyXL is a module for python operations such as Excel reading and writing.

You can actually do the following operations by OpenPyXL.

what we can by OpenPyXL
  • Create Excel file
  • Read existing Excel sheet data (read/write cell value)
  • Setting formula
  • Copying datas
  • Creating charts
  • Link with Pandas and Numpy

and so on… などなど、Other formats (style related) and filters can also be handled. I rarely use it, but can also read pivot-table (can not create it)

Excel file that can be operated by OpenPyXL

OpenPyXl can support Excel files in .xlsx format. Even free software libra office can be saved in xlsx format, so the point is that the file format is .xlsx.

On the contrary, .xls cannot be supported. If you want to use an Excel file with extension before Excel 2003 (extension is .xls), you should consider using xlrd module.

Or, it is said that there are .xls left at this time, but sometimes they are sent to me, so I think they are still alive (only in Japan??).

How to install OpenPyXL

If you are using pip( or pip3) for module management, please install in the following form

#using pip

pip install openpyxl

#using pip3

pip3 install openpyxl

In case of using Anaconda(conda),

Anaconda(conda)の場合は、I think it was installed from the beginning. (I’m not using it these days, so I’m not sure about it.)

How to read excel files by OpenPyXL

Let’s read some excel file anyway.

Read Excel file and get the sheet name list

from openpyxl import load_workbook

path = "absolute Path or relative Path"
ss = load_workbook(path)
print(ss.sheetnames)

About the location of the read file

I’ve looked at several sites and found that one needs to move (copy) the .xlsx file to the same directory as the environment in which Python is running.

I think it is not nessesary. If you pass the absolute path, it will work without any problems.

needs to move (copy) the .xlsx file to the same directory as the environment in which Python is running.

Get workbook

openpyxl.load_workbook(file_path)

For getting “workbook obj”, using load_wokbook method. The argument is path or file name (if they are in the same folder).

similar as Spreadsheet object in GAS (object obtained by openbyid method etc.)

Get sheet name

workbook obj.sheetnames

If you take the sheetnames property for the workbook object (which is the object obtained by load_workbook), a list of sheet names will be returned as an array.

Responses as bellow images.

['sheet1', 'sheet2', 'sheet3']

Occasionally, there is an article (probably an old article) that introduces getting a list of sheet names with the get_sheet_names method, but currently the get_sheet_names method is deprecated, so it is basic to use the sheetnames property.

openpyxl.workbook.workbook module — openpyxl 3.0.5 documentation

sheets getting/creating/copying/deleting

It is about basic sheet operation.

Getting sheets object

You can get sheet object by specifying the sheet name in the Workbook object.

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']

similar as sheets object in GAS(object obtained by getSheetByName method etc.)

Creating sheets

You can create a sheet with the create_sheet method of the Workbook object. You can specify the sheet name as the first argument and the position to insert the sheet as an index in the second argument. Both arguments are optional.

Workbook obj.create_sheet(‘sheet name’, index)

path = "absolute path or relative path"
ss = load_workbook(path)
ss.create_sheet('Additional Sheet', 0)

Copying sheets

Sheet copy can be done with copy_worksheet method of Workbook object. Specify the Worksheet object to be copied in the argument.

Workbook obj.copy_worksheet(worksheet obj)

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']
sheet2 = ss.copy_worksheet(sheet1)

Deleting sheets

To delete a sheet, use the remove method of Workbook object. The argument specifies the Worksheet object you want to delete.

Workbook obj.remove(worksheet obj)

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']
sheet2 = ss.remove(sheet1)

Getting/editing single cell/multiple cells

Get single cell with value

If you get a single cell, you can get it with the cell name (A1 notation). If you want to get the cell value, you can get it with cell obj.value

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']

b2_cell = sheet1['b2']
print(b2_cell.value)

Editing single cell

Once you get a single cell, you can edit the single cell by replacing the value property.

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']

b2_cell = sheet1['b2']
b2_cell.value = "added from openpyxl"
print(be_cell.value)

Edit directly from sheet object with cell method

In addition to the method of specifying the cell and acquiring the value, it is also possible to acquire the value by directly specifying the row/column by using the cell method from the sheet object.

worksheet obj.cell(row, column, value=”value”)

value is optional. If you do not want to insert a value and want to get a cell object, you can do without specifying value.

It is recommended that you set it with keyword arguments so that no mistakes will occur.

Get multiple cells with values

There are two ways to specify the cell range: (1) specify with A1 notation (2) specify with row/column range.

(1) specify with A1 notation

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']

B2toC5_cells = sheet1['b2':'c5']
for row in B2toC5_cells:
    for cell in row:
        print(cell.value)

(2) specify with row/column range

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']
#行単位で指定
cells1 = sheet1[10]
cells2 = sheet1[5:10]
#列単位で指定
cells3 = sheet1['A']
cells4 = sheet1['A':'C']

Editing multiple cells

The following script will set the same value all at once, but the image is as follows. Actually, we will give conditional branch and execute。

path = "absolute path or relative path"
ss = load_workbook(path)
sheet1 = ss['sheet1']

B2toC5_cells = sheet1['b2':'c5']
for row in B2toC5_cells:
    for cell in row:
        cell.value = "modified from openpyxl"

Saving Excel files

Specify the save destination path in the save method of Workbook object. When creating a new file, specify a new path, and when overwriting, specify an existing path. Basically, I think it’s better to have the absolute path specified.

workbook obj.save(‘path’)

Summary

It would have been very easy to imagine if you had a little experience with GAS and python.

GAS users are grateful that the creation is similar to SpreadsheetApp.

It may be said that you should use a spreadsheet, but there are some things such as scraping that python is more compatible than GAS, so it would be nice if openpyxl can be used in local Excel.

If you use sheet API, you can upload Excel file and write data. Please read the following articles if you are interested.

コメント