SkobelevIgor/stackexchange-xml-converter

Stack Exchange (e.g., Stack Overflow) data dump converter from XML to CSV format.

A CLI tool that converts Stack Exchange data dumps from XML to CSV or JSON, formats that are better suited for importing into databases.

RDBMS schema examples

Schema examples are available for different databases.

Getting started

Before you begin, make sure that you have:

  • A working Go environment with Go version >= 1.14. Run the go version command in the console; it should display the current version of the compiler.
  • An archiver that can extract .7z files, for example 7z.

Download database dump

Choose and download the database dump that you are going to convert.

Important: the Stack Overflow dump is split across 8 separate 7z archives.

Extract

Extract the contents of the archive(s) into the directory from which you will convert the XML files.

Example with the academia.stackexchange.com.7z dump:

$ mkdir xml csv
$ 7z e academia.stackexchange.com.7z -oxml
$ ls xml/
Badges.xml  Comments.xml  PostHistory.xml  PostLinks.xml  Posts.xml  Tags.xml  Users.xml  Votes.xml

Build the stackexchange-xml-converter

Clone and build the converter:

$ git clone https://github.com/SkobelevIgor/stackexchange-xml-converter
$ cd stackexchange-xml-converter/
$ go build

Converting XML to CSV/JSON

You now have the stackexchange-xml-converter executable file. Let's convert the XML files to the CSV format:

./stackexchange-xml-converter -result-format=csv -source-path=../xml -store-to-dir=../csv

List of possible flags:

  • result-format (Required) Result format (csv or json)
  • source-path (Required) Absolute or relative path to a directory containing XML file(s), or to a single XML file.
  • store-to-dir (Optional) Absolute or relative path to the directory where the converted files are stored.
  • skip-html-decoding (Optional) Some of the files (e.g., Posts.xml) contain escaped HTML. By default, the converter decodes it. To disable this behavior, use this flag.
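
As an illustration of what decoding escaped HTML means, the Go sketch below uses html.UnescapeString from the standard library. Whether the converter decodes fields in exactly this way is an assumption; decodeField and its skip parameter are hypothetical names that only mirror the idea behind the skip-html-decoding flag.

```go
package main

import (
	"fmt"
	"html"
)

// decodeField is a hypothetical helper (not taken from the converter).
// It returns value with HTML entities decoded, unless skip is true,
// in which case the value is passed through untouched.
func decodeField(value string, skip bool) string {
	if skip {
		return value
	}
	return html.UnescapeString(value)
}

func main() {
	// Made-up field value in escaped form.
	body := "&lt;p&gt;Hello&lt;/p&gt;"

	fmt.Println(decodeField(body, false)) // prints <p>Hello</p>
	fmt.Println(decodeField(body, true))  // prints &lt;p&gt;Hello&lt;/p&gt;
}
```

Skipping the decoding step can be useful when the target database or a later processing stage expects the escaped form.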

License

MIT License
