Basic stand-alone disk-based N-way merge sort component for Java


cowtowncoder/java-merge-sort


This project implements a basic disk-backed multi-way merge sort with configurable input and output formats (i.e. not just textual sort). It should be useful for systems that process large amounts of data, as a simple building block for sort phases.

Documentation

Check out the project wiki for more documentation, including Javadocs.

License

The library is licensed under Apache License 2.0.

JDK Requirement

Version 1.1.0 (released on 2022-11-19) requires Java 8.

Earlier versions (1.0.2 and before) require Java 6.

Usage

Programmatic access

The main class to interact with is com.fasterxml.sort.Sorter, which needs to be constructed with four things:

  • Configuration settings (the default SortConfig works fine)
  • A DataReaderFactory, which is used for creating readers for intermediate sort files (and for the input, if a stream is passed)
  • A DataWriterFactory, which is used for creating writers for intermediate sort files (and for the results, if a stream is passed)
  • Comparator for data items

An example of how this can be done can be found in com.fasterxml.sort.std.TextFileSorter. Basic implementations exist for line-based text input (in package com.fasterxml.sort.std), and additional implementations may be added: for example, a JSON data sorter could be implemented as an extension module of Jackson. Fortunately, implementing your own readers and writers is trivial.
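To give a feel for how small a non-textual reader/writer pair can be, here is a minimal, self-contained sketch for a binary format of 8-byte big-endian longs. It deliberately does not depend on this library: LongReader, LongWriter, and their method names are hypothetical stand-ins for the reader and writer roles, so check the Javadocs for the actual abstract methods before adapting it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative codec for fixed-width binary records; not part of the library. */
final class LongRecordCodec {

    /** Hypothetical reader role: returns the next item, or null at end of input. */
    static final class LongReader implements AutoCloseable {
        private final DataInputStream in;

        LongReader(InputStream in) { this.in = new DataInputStream(in); }

        Long readNext() throws IOException {
            try {
                return in.readLong();
            } catch (EOFException e) {
                // This sketch also treats a truncated trailing record as end of input.
                return null;
            }
        }

        @Override public void close() throws IOException { in.close(); }
    }

    /** Hypothetical writer role: appends one item per call. */
    static final class LongWriter implements AutoCloseable {
        private final DataOutputStream out;

        LongWriter(OutputStream out) { this.out = new DataOutputStream(out); }

        void writeEntry(long item) throws IOException { out.writeLong(item); }

        @Override public void close() throws IOException { out.close(); }
    }

    /** Writes the values to a byte array and reads them back, in order. */
    static List<Long> roundTrip(List<Long> values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (LongWriter writer = new LongWriter(bytes)) {
                for (long v : values) writer.writeEntry(v);
            }
            List<Long> result = new ArrayList<>();
            try (LongReader reader =
                     new LongReader(new ByteArrayInputStream(bytes.toByteArray()))) {
                Long next;
                while ((next = reader.readNext()) != null) result.add(next);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The matching comparator for such records would simply be Long::compare; the factories would wrap the streams they are handed in these two classes.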

With a Sorter instance, you can call one of two main sort methods:

    public void sort(InputStream source, OutputStream destination)
    public boolean sort(DataReader<T> inputReader, DataWriter<T> resultWriter)

where the former takes input and output as streams and uses the configured reader/writer factories to construct a DataReader for input and a DataWriter for output, and the latter just uses pre-constructed instances.

In addition to core sorting functionality, a Sorter instance also gives access to progress information (it implements the SortingState interface with accessor methods).

A very simple example of sorting a text file using line-by-line comparison is:

    // TextFileSorter lives in com.fasterxml.sort.std; SortConfig in com.fasterxml.sort
    TextFileSorter sorter = new TextFileSorter(new SortConfig().withMaxMemoryUsage(20 * 1000 * 1000));
    sorter.sort(new FileInputStream("input.txt"), new FileOutputStream("output.txt"));

which would read text from file "input.txt", sort using about 20 MB of heap (note: estimates for memory usage are rough), use temporary files if necessary (i.e. for small files it is just an in-memory sort; for bigger ones, a real merge sort), and write the output to file "output.txt".
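The in-memory versus spill-to-disk behavior can be illustrated with a self-contained sketch of the general technique. This is not the library's implementation: the class and method names are hypothetical, the memory limit is expressed as a line count rather than an estimated byte size, lines are assumed to contain no line terminators, and the result is collected into a list where a real sorter would stream it to the output.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative external merge sort over text lines; not the library's code. */
final class ExternalSortSketch {

    /** One open run file plus the line most recently read from it. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        boolean advance() throws IOException {
            head = reader.readLine();
            return head != null;
        }

        @Override public int compareTo(Run other) { return head.compareTo(other.head); }
    }

    /** Sorts lines while buffering at most maxLinesInMemory of the input at a time. */
    static List<String> sortLines(Iterable<String> input, int maxLinesInMemory) {
        try {
            List<Path> runs = new ArrayList<>();
            List<String> buffer = new ArrayList<>();
            for (String line : input) {
                buffer.add(line);
                if (buffer.size() >= maxLinesInMemory) {
                    runs.add(spill(buffer));   // buffer full: write a sorted run to disk
                    buffer.clear();
                }
            }
            if (runs.isEmpty()) {              // small input: plain in-memory sort
                Collections.sort(buffer);
                return buffer;
            }
            if (!buffer.isEmpty()) runs.add(spill(buffer));
            return merge(runs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sorts the buffer and writes it to a new temporary run file. */
    private static Path spill(List<String> buffer) throws IOException {
        Collections.sort(buffer);
        Path file = Files.createTempFile("sort-run-", ".txt");
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String line : buffer) {
                out.write(line);
                out.newLine();
            }
        }
        return file;
    }

    /** N-way merge: repeatedly take the smallest head line among all runs. */
    private static List<String> merge(List<Path> runFiles) throws IOException {
        PriorityQueue<Run> queue = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        List<String> result = new ArrayList<>();
        try {
            for (Path file : runFiles) {
                BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                readers.add(reader);
                Run run = new Run(reader);
                if (run.head != null) queue.add(run);
            }
            while (!queue.isEmpty()) {
                Run smallest = queue.poll();
                result.add(smallest.head);
                if (smallest.advance()) queue.add(smallest);  // re-queue under its new head
            }
        } finally {
            for (BufferedReader r : readers) r.close();
            for (Path file : runFiles) Files.deleteIfExists(file);
        }
        return result;
    }
}
```

Calling ExternalSortSketch.sortLines(lines, 2) on five lines forces three temporary run files and a three-way merge, while a limit of 100 takes the in-memory path; both return the same sorted list.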

Command-line utility

The project jar is packaged such that it can be used as a primitive 'sort' tool like so:

    java -jar java-merge-sort-1.1.0.jar [input-file]

where the sorted output gets printed to stdout, and the argument is optional (if missing, input is read from stdin). (Implementation note: this uses the standard TextFileSorter mentioned above.)

The format is assumed to be basic text lines, similar to Unix sort, and the sorting order is basic byte sorting (which works for most common encodings).
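As a rough illustration of what byte-order sorting means for encoded text, the sketch below compares lines by their UTF-8 bytes. It is an assumption-laden example, not the library's comparator: the names are hypothetical, and bytes are treated as unsigned here, which the text above does not specify.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative byte-wise ordering of text lines; not the library's comparator. */
final class ByteOrderSketch {

    /** Compares arrays byte by byte, treating each byte as unsigned (0..255). */
    static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return a.length - b.length;   // a shorter prefix sorts first
    };

    /** Encodes each line as UTF-8, sorts by raw bytes, and decodes the result. */
    static String[] sortByUtf8Bytes(String... lines) {
        byte[][] encoded = new byte[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            encoded[i] = lines[i].getBytes(StandardCharsets.UTF_8);
        }
        Arrays.sort(encoded, UNSIGNED_LEXICOGRAPHIC);
        String[] sorted = new String[encoded.length];
        for (int i = 0; i < encoded.length; i++) {
            sorted[i] = new String(encoded[i], StandardCharsets.UTF_8);
        }
        return sorted;
    }
}
```

Under this ordering uppercase ASCII letters sort before lowercase ones, and non-ASCII characters sort after all ASCII, because every byte of a multi-byte UTF-8 sequence is 0x80 or higher. That is consistent and cheap, but it is not a locale-aware collation.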

More documentation

Here are some external links:

Getting involved

To access the source, just clone the project.
