CSV Validation Tool and API (CSV Schema RI)


digital-preservation/csv-validator


A Validation Tool and APIs for validating CSV (Comma Separated Value) files by using CSV Schema.


Released under the Mozilla Public Licence version 2.0.

A comprehensive user guide is available in GitHub Pages, along with a more complete specification of the CSV Schema language.

Technology

The Validation tool and APIs are written in Scala 2.13 and may be used as:

  • A stand-alone command line tool.

  • A desktop tool (we provide a simple Swing GUI).

  • A library in your Scala project.

  • A library in your Java project (We provide a Java 11 interface, to make things simple for Java programmers too).

The Validation Tool and APIs can be used on any Java Virtual Machine which supports Java 11 or later (NB Java 6 support was removed in version 1.1). The source code is built using the Apache Maven build tool:

  1. For use in other Java/Scala Applications, build by executing mvn clean install.
  2. For the Command Line Interface or Swing GUI, build by executing mvn clean package.

Maven Artifacts

Released Maven Artifacts can be found in Maven Central under the groupId uk.gov.nationalarchives.

Java API

If you wish to use the CSV Validator from your own Java project, we provide a native Java API. The dependency details are:

    <dependency>
        <groupId>uk.gov.nationalarchives</groupId>
        <artifactId>csv-validator-java-api</artifactId>
        <version>1.4.0</version>
    </dependency>

The Javadoc can be found in Maven Central, or you can build it locally by executing mvn javadoc:javadoc.

Example Java code using the CSV Validator through the Java API:

    import java.io.FileReader;
    import java.io.Reader;
    import java.nio.charset.Charset;
    import java.util.ArrayList;
    import java.util.List;
    // CsvValidator, FailMessage, WarningMessage, Substitution and ProgressCallback
    // come from the csv-validator-java-api dependency above.

    Charset csvEncoding = Charset.forName("UTF-8"); // default is UTF-8
    boolean validateCsvEncoding = true;
    Charset csvSchemaEncoding = Charset.forName("UTF-8"); // default is UTF-8
    boolean failFast = true; // default is false
    List<Substitution> pathSubstitutions = new ArrayList<Substitution>(); // default is an empty ArrayList
    boolean enforceCaseSensitivePathChecks = true; // default is false
    boolean trace = false; // default is false
    ProgressCallback progress = null; // default is null
    boolean skipFileChecks = true; // default is false
    int maxCharsPerCell = 8096; // default is 4096

    // add a substitution path
    pathSubstitutions.add(new Substitution("file://something", "/home/xxx"));

    String csvFile = "/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data.csv";
    String csvSchemaFile = "/home/dev/IdeaProjects/csv/csv-validator/csv-validator-core/data-schema.csvs";

    CsvValidator.ValidatorBuilder validateWithStringNames =
        new CsvValidator.ValidatorBuilder(csvFile, csvSchemaFile);

    // alternatively, you can pass in Readers for each file
    // (java.io.Reader is abstract, so use a concrete implementation such as FileReader)
    Reader csvReader = new FileReader(csvFile);
    Reader csvSchemaReader = new FileReader(csvSchemaFile);
    CsvValidator.ValidatorBuilder validateWithReaders =
        new CsvValidator.ValidatorBuilder(csvReader, csvSchemaReader);

    List<FailMessage> messages = validateWithStringNames
        // validateCsvEncoding should only be `true` if using UTF-8 encoding, otherwise it will throw an exception
        .usingCsvEncoding(csvEncoding, validateCsvEncoding)
        .usingCsvSchemaEncoding(csvSchemaEncoding)
        .usingFailFast(failFast)
        .usingPathSubstitutions(pathSubstitutions)
        .usingEnforceCaseSensitivePathChecks(enforceCaseSensitivePathChecks)
        .usingTrace(trace)
        .usingProgress(progress)
        .usingSkipFileChecks(skipFileChecks)
        .usingMaxCharsPerCell(maxCharsPerCell)
        .runValidation();

    if (messages.isEmpty()) {
        System.out.println("All worked OK");
    } else {
        for (FailMessage message : messages) {
            if (message instanceof WarningMessage) {
                System.out.println("Warning: " + message.getMessage());
            } else {
                System.out.println("Error: " + message.getMessage());
            }
        }
    }
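The comment on usingCsvEncoding notes that encoding validation should only be switched on for UTF-8 input. How the validator performs that check internally is not shown here. As a rough, stand-alone illustration of what strict UTF-8 validation involves, the JDK-only sketch below decodes a byte array and reports malformed sequences instead of silently replacing them. The class name Utf8CheckSketch and the method isValidUtf8 are hypothetical and are not part of the CSV Validator API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the CSV Validator.
final class Utf8CheckSketch {

    // Returns true if the bytes decode as UTF-8 without any malformed
    // or unmappable sequence, false otherwise.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)        // fail instead of substituting U+FFFD
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(isValidUtf8(utf8));
        System.out.println(isValidUtf8(latin1));
    }
}
```

A check of this kind only makes sense for UTF-8: many single-byte encodings accept every byte value, so there is nothing to reject.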

Scala API

Likewise, if you wish to use the CSV Validator from your own Scala project, the Scala API is part of the core. The dependency details are:

    <dependency>
        <groupId>uk.gov.nationalarchives</groupId>
        <artifactId>csv-validator-core</artifactId>
        <version>1.3.0</version>
    </dependency>

The Scaladoc can be found in Maven Central, or you can build it locally by executing mvn scala:doc.

An example of using the Scala API can be found in the class uk.gov.nationalarchives.csv.validator.api.java.CsvValidatorJavaBridge from the csv-validator-java-api module. The Scala API at present gives much more control over the individual Schema Parsing and Validation Processor than the Java API.

Schema Examples

Examples of CSV Schema can be found in the test cases of the csv-validator-core module. See the *.csvs files in acceptance/. Schemas used by the Digital Preservation department at The National Archives are also available in the example-schemas folder of the csv-schema repository.

Current Limitations of the CSV Validator Tool

The CSV Validator implements almost all of the CSV Schema 1.2 (Draft) language. Current limitations and missing functionality are:

  • DateExpr is not yet fully implemented (may raise Schema check error).

  • PartialDateExpr is not yet implemented (raises Schema check error).

  • At least MD5, SHA-1, SHA-2, SHA-3, and SHA-256 checksum algorithms are supported. Many more are probably supported as well, since we defer to Java's java.security.MessageDigest class.
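
Because checksum support is deferred to java.security.MessageDigest, the set of usable algorithms depends on the JVM the validator runs on. The sketch below is a stand-alone, JDK-only illustration of how one might check whether an algorithm name is available and compute a hex digest with it. It is not the validator's own code, and the class name DigestSketch and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;

// Hypothetical helper for illustration only; not part of the CSV Validator.
final class DigestSketch {

    // True if this JVM has a MessageDigest provider for the given name.
    static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    // Returns the lowercase hex digest of data; throws
    // IllegalArgumentException if the algorithm is not available.
    static String hexDigest(String algorithm, byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b)); // two hex digits per byte
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexDigest("MD5", data));
        System.out.println(hexDigest("SHA-256", data));
        // Every MessageDigest algorithm name registered on this JVM.
        System.out.println(Security.getAlgorithms("MessageDigest"));
    }
}
```

Running main prints the algorithm names registered on the local JVM, which is a quick way to see which checksum names a schema could rely on in a given environment.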

