pyexcel - Let you focus on data, instead of file formats

Author: C.W.

Source code: http://github.com/pyexcel/pyexcel.git

Issues: http://github.com/pyexcel/pyexcel/issues

License: New BSD License

Released: 0.7.3

Generated: Apr 26, 2025

Introduction

pyexcel provides one unified API for reading, manipulating, and writing data in various Excel formats. It simplifies the process of handling Excel files, making it an enjoyable task. Data in Excel files can be easily converted into arrays or dictionaries with minimal code, and vice versa. This library focuses purely on data processing and does not address features like fonts, colors, or charts.

The idea behind pyexcel originated from a common usability problem: when Excel-driven web applications are delivered to non-developer users (e.g., project assistants, human resources administrators), they are often not aware of the differences between file formats such as CSV, XLS, and XLSX. Rather than training users on these formats, pyexcel provides web developers with a unified interface to handle most Excel file types.

To add support for a specific Excel format in your application, simply install an additional pyexcel plugin; no code changes are required. This eliminates issues with different file formats. In the broader community, pyexcel and its associated libraries aim to be a simple, easy-to-install alternative to Pandas for cases where only minimal data manipulation is needed.
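The arrays and dictionaries mentioned in the introduction are ordinary Python structures. The sketch below makes no pyexcel calls; it only illustrates, with hypothetical helper names, three shapes such data can take: a two-dimensional array, a list of per-row dictionaries (records), and a dictionary keyed by column name.

```python
# Plain-Python sketch of the data shapes; no pyexcel calls are used here.
# All function names are hypothetical and exist only for this illustration.


def array_to_records(array):
    """Turn [[header...], [row...], ...] into a list of dicts."""
    header, *rows = array
    return [dict(zip(header, row)) for row in rows]


def records_to_array(records):
    """Turn a list of dicts back into a header row plus data rows."""
    header = list(records[0])
    return [header] + [[record[key] for key in header] for record in records]


def array_to_column_dict(array):
    """Turn [[header...], [row...], ...] into {column name: [values...]}."""
    header, *rows = array
    return {name: [row[i] for row in rows] for i, name in enumerate(header)}


array = [["Name", "Age"], ["Adam", 28], ["Beatrice", 29]]
records = array_to_records(array)
columns = array_to_column_dict(array)
```

Keeping these shapes in mind makes it easier to choose between keyword arguments such as records= and array= in the examples that follow.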

Support the project

If your company uses pyexcel and its components in a revenue-generating product, please consider supporting the project on GitHub or Patreon. Your financial support will enable me to dedicate more time to coding, improving documentation, and creating engaging content.

Installation

You can install pyexcel via pip:

$ pip install pyexcel

or clone it and install it:

$ git clone https://github.com/pyexcel/pyexcel.git
$ cd pyexcel
$ python setup.py install

Suppose you have the following data as a list of dictionaries:

Name      Age
--------  ---
Adam      28
Beatrice  29
Ceri      30
Dean      26

You can easily save it into an Excel file using the following code:

>>> import pyexcel
>>> # make sure you had pyexcel-xls installed
>>> a_list_of_dictionaries = [
...     {
...         "Name": 'Adam',
...         "Age": 28
...     },
...     {
...         "Name": 'Beatrice',
...         "Age": 29
...     },
...     {
...         "Name": 'Ceri',
...         "Age": 30
...     },
...     {
...         "Name": 'Dean',
...         "Age": 26
...     }
... ]
>>> pyexcel.save_as(records=a_list_of_dictionaries, dest_file_name="your_file.xls")

And here’s how to obtain the records:

>>> import pyexcel as p
>>> records = p.iget_records(file_name="your_file.xls")
>>> for record in records:
...     print("%s is aged at %d" % (record['Name'], record['Age']))
Adam is aged at 28
Beatrice is aged at 29
Ceri is aged at 30
Dean is aged at 26
>>> p.free_resources()

Custom data rendering:

>>> # pip install pyexcel-text==0.2.7.1
>>> import pyexcel as p
>>> ccs_insight2 = p.Sheet()
>>> ccs_insight2.name = "Worldwide Mobile Phone Shipments (Billions), 2017-2021"
>>> ccs_insight2.ndjson = """
... {"year": ["2017", "2018", "2019", "2020", "2021"]}
... {"smart phones": [1.53, 1.64, 1.74, 1.82, 1.90]}
... {"feature phones": [0.46, 0.38, 0.30, 0.23, 0.17]}
... """.strip()
>>> ccs_insight2
pyexcel sheet:
+----------------+------+------+------+------+------+
| year           | 2017 | 2018 | 2019 | 2020 | 2021 |
+----------------+------+------+------+------+------+
| smart phones   | 1.53 | 1.64 | 1.74 | 1.82 | 1.9  |
+----------------+------+------+------+------+------+
| feature phones | 0.46 | 0.38 | 0.3  | 0.23 | 0.17 |
+----------------+------+------+------+------+------+

Advanced usage

If you are dealing with large data sets, consider the generator-based functions iget_records and isave_as:

>>> def increase_everyones_age(generator):
...     for row in generator:
...         row['Age'] += 1
...         yield row
>>> def duplicate_each_record(generator):
...     for row in generator:
...         yield row
...         yield row
>>> records = p.iget_records(file_name="your_file.xls")
>>> io = p.isave_as(records=duplicate_each_record(increase_everyones_age(records)),
...     dest_file_type='csv', dest_lineterminator='\n')
>>> print(io.getvalue())
Age,Name
29,Adam
29,Adam
30,Beatrice
30,Beatrice
31,Ceri
31,Ceri
27,Dean
27,Dean

Two advantages of the above method:

  1. You can add as many wrapping functions as you want.

  2. Memory consumption stays constant, because records are processed one at a time.
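To show why chaining generators keeps only one record in flight, here is a stdlib-only sketch of the same pipeline. It does not use pyexcel: read_records and write_csv are hypothetical stand-ins for iget_records and isave_as, and the input is a small in-memory CSV rather than your_file.xls.

```python
import csv
import io

# Stdlib-only sketch of the generator pipeline; it does not use pyexcel.
# read_records and write_csv are hypothetical stand-ins for the
# iget_records and isave_as calls shown in the session above.


def read_records(stream):
    """Yield one dict per CSV row, converting Age to int."""
    for row in csv.DictReader(stream):
        row["Age"] = int(row["Age"])
        yield row


def increase_everyones_age(generator):
    for row in generator:
        row["Age"] += 1
        yield row


def duplicate_each_record(generator):
    for row in generator:
        yield row
        yield row


def write_csv(records, fieldnames):
    """Pull records one at a time from the pipeline and write each row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return out


source = io.StringIO("Name,Age\nAdam,28\nBeatrice,29\n")
pipeline = duplicate_each_record(increase_everyones_age(read_records(source)))
result = write_csv(pipeline, fieldnames=["Age", "Name"])
print(result.getvalue())
```

Nothing is read from the source until write_csv starts iterating, so each wrapper sees a single row at a time. In this sketch the output buffer still grows in memory; writing to a file on disk instead would avoid that.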

To read or write individual Excel file formats, install the matching plugins as needed:

A list of file formats supported by external plugins

Package name  Supported file formats                   Dependencies
------------  ---------------------------------------  ------------------------------------
pyexcel-io    csv, csvz [1], tsv, tsvz [2]             csvz, tsvz readers depend on chardet
pyexcel-xls   xls, xlsx (read only), xlsm (read only)  xlrd, xlwt
pyexcel-xlsx  xlsx                                     openpyxl
pyexcel-ods3  ods                                      pyexcel-ezodf, lxml
pyexcel-ods   ods                                      odfpy

Dedicated file readers and writers

Package name      Supported file formats   Dependencies
----------------  -----------------------  --------------
pyexcel-xlsxw     xlsx (write only)        XlsxWriter
pyexcel-libxlsxw  xlsx (write only)        libxlsxwriter
pyexcel-xlsxr     xlsx (read only)         lxml
pyexcel-xlsbr     xlsb (read only)         pyxlsb
pyexcel-odsr      read only for ods, fods  lxml
pyexcel-odsw      write only for ods       loxun
pyexcel-htmlr     html (read only)         lxml, html5lib
pyexcel-pdfr      pdf (read only)          camelot

Plugin shopping guide

Since 2020, all pyexcel-io plugins have dropped support for Python versions lower than 3.6. If you need any of those Python versions, use versions of pyexcel-io and its plugins lower than 0.6.0.

Unlike csv files, xlsx and ods files are zip archives of a folder containing many xml files. (xls, by contrast, is a binary format rather than a zip archive.)

The dedicated readers for Excel files can read in streaming mode.

To manage the list of installed plugins, use pip to add or remove a plugin. With virtualenv, you can have different plugins per virtual environment. If multiple plugins in your environment do the same thing, you need to tell pyexcel which plugin to use per function call. For example, if both pyexcel-ods and pyexcel-odsr are installed and you want get_array to use pyexcel-odsr, call get_array(..., library='pyexcel-odsr').
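The remark earlier in this guide that xlsx and ods files are zip archives of xml files can be checked with the standard library. The sketch below is an illustration under stated assumptions: it builds a tiny archive in memory, and the member names are placeholders, not the real layout of an xlsx or ods file. The helper names are hypothetical.

```python
import io
import zipfile

# Stdlib-only sketch. The member names written below are placeholders for
# this illustration, not the exact layout of a real xlsx or ods file.


def looks_zip_based(stream):
    """Return True if the file-like object holds a zip archive."""
    return zipfile.is_zipfile(stream)


def list_xml_members(stream):
    """Return the names of the .xml entries stored in a zip archive."""
    with zipfile.ZipFile(stream) as archive:
        return [name for name in archive.namelist() if name.endswith(".xml")]


# Build a tiny archive in memory so the example runs without a real file.
fake_workbook = io.BytesIO()
with zipfile.ZipFile(fake_workbook, "w") as archive:
    archive.writestr("sheet1.xml", "<sheet/>")
    archive.writestr("styles.xml", "<styles/>")

# A csv file is plain text, so the same check reports False for it.
plain_csv = io.BytesIO(b"Name,Age\nAdam,28\nBeatrice,29\n")
```

Passing an open xlsx or ods file to looks_zip_based in place of fake_workbook should behave the same way, which is why dedicated readers can work on the contained xml directly.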

Other data renderers

Package name          Supported file formats                    Dependencies  Python versions
--------------------  ----------------------------------------  ------------  ----------------------------------
pyexcel-text          write only: rst, mediawiki, html, latex,  tabulate      2.6, 2.7, 3.3, 3.4, 3.5, 3.6, pypy
                      grid, pipe, orgtbl, plain simple;
                      read only: ndjson; r/w: json
pyexcel-handsontable  handsontable in html                      handsontable  same as above
pyexcel-pygal         svg chart                                 pygal         2.7, 3.3, 3.4, 3.5, 3.6, pypy
pyexcel-sortable      sortable table in html                    csvtotable    same as above
pyexcel-gantt         gantt chart in html                       frappe-gantt  except pypy, same as above

Footnotes

[1] zipped csv file

[2] zipped tsv file

For compatibility tables of pyexcel-io plugins, please click here.

Plugin compatibility table

pyexcel   pyexcel-io  pyexcel-text  pyexcel-handsontable  pyexcel-pygal  pyexcel-gantt
--------  ----------  ------------  --------------------  -------------  -------------
0.6.5+    0.6.2+      0.2.6+        0.0.1+                0.0.1          0.0.1
0.5.15+   0.5.19+     0.2.6+        0.0.1+                0.0.1          0.0.1
0.5.14    0.5.18      0.2.6+        0.0.1+                0.0.1          0.0.1
0.5.10+   0.5.11+     0.2.6+        0.0.1+                0.0.1          0.0.1
0.5.9.1+  0.5.9.1+    0.2.6+        0.0.1                 0.0.1          0.0.1
0.5.4+    0.5.1+      0.2.6+        0.0.1                 0.0.1          0.0.1
0.5.0+    0.4.0+      0.2.6+        0.0.1                 0.0.1          0.0.1
0.4.0+    0.3.0+      0.2.5

A list of supported file formats

file format  definition
-----------  ---------------------------------------------------------------------
csv          comma separated values
tsv          tab separated values
csvz         a zip file that contains one or many csv files
tsvz         a zip file that contains one or many tsv files
xls          a spreadsheet file format created by MS-Excel 97-2003
xlsx         MS-Excel Extensions to the Office Open XML SpreadsheetML File Format
xlsm         an MS-Excel Macro-Enabled Workbook file
ods          open document spreadsheet
fods         flat open document spreadsheet
json         JavaScript Object Notation
html         html table of the data structure
simple       simple presentation
rst          reStructuredText presentation of the data
mediawiki    media wiki table

Usage

Suppose you want to process the Excel data stored in your_file.xls, the file saved in the installation example above.

Here is an example usage:

>>> import pyexcel as pe
>>> records = pe.iget_records(file_name="your_file.xls")
>>> for record in records:
...     print("%s is aged at %d" % (record['Name'], record['Age']))
Adam is aged at 28
Beatrice is aged at 29
Ceri is aged at 30
Dean is aged at 26
>>> pe.free_resources()

Design

New tutorial

Old tutorial

Cook book

Real world cases

API documentation

Developer’s guide

Change log

Indices and tables