WIP NUTCH-3064 Upgrade com.maxmind.geoip2:geoip2 dependency in geoip-index to v4.2.0 #825


Draft
lewismc wants to merge 2 commits into apache:master from lewismc:NUTCH-3064


lewismc commented Sep 12, 2024 (edited)

Work in Progress

This PR begins to address NUTCH-3064 by upgrading the com.maxmind.geoip2:geoip2 dependency to v4.2.0. It has not yet been tested in a distributed Nutch deployment. Although no additional dependencies have been added, I still want to test a full deployment.

In addition to the proposed upgrade I performed some refactoring which I considered to be improvements.

Refactoring/Improvements

  1. Establishes unit test(s). I have more work to do here to accommodate the change in logic for loading the MaxMind DB file(s) from the classpath.
  2. Removes duplication of configuration documentation, including it only in nutch-default.xml.
  3. Removes insightsService as the default value for the index.geoip.usage configuration property. The value is now empty.
  4. Introduces a new property, index.geoip.db.file, which specifies the MaxMind DB file packaged in the Nutch .job file.
  5. Adds Javadoc to every class and method of the index-geoip plugin (more work to be done here).
  6. Uses the updated GeoIP Database guidance, specifically:
  • Using the try methods; "...If you are looking up many IPs that are not contained in the database, the try method will be slightly faster as they do not need to construct and throw an exception."
  • Using DB caching; "... Using this cache, lookup performance is significantly improved at the cost of a small (~2MB) memory overhead."
  7. Updates the set of fields available for each database, as new fields have been added to the Java API since I first wrote this plugin.
  8. Simplifies the values available for the index.geoip.usage configuration property. Available values are now anonymous, asn, city, connection, domain, insights or isp. THIS IS A BACKWARDS INCOMPATIBLE BREAKING CHANGE which we would need to call out in the release notes. I decided to implement this change based on recent feedback, which I agree with.
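
To make the effect of the simplified index.geoip.usage values concrete, here is a minimal sketch of how such a value could be validated. It is not taken from the PR: the GeoIpUsage enum and GeoIpUsageParser class are hypothetical names, and both the case-insensitive matching and the treatment of an empty value as "no lookup type configured" are assumptions.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical enum mirroring the seven values listed in this PR description.
enum GeoIpUsage {
  ANONYMOUS, ASN, CITY, CONNECTION, DOMAIN, INSIGHTS, ISP
}

// Hypothetical helper; the plugin's real parsing logic may differ.
final class GeoIpUsageParser {
  private GeoIpUsageParser() {}

  /**
   * Maps a raw index.geoip.usage value to a lookup type.
   * A null or blank value (the new default) yields Optional.empty().
   * Any value outside the new list, such as the former default
   * insightsService, is rejected with an IllegalArgumentException.
   */
  static Optional<GeoIpUsage> parse(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return Optional.empty();
    }
    // Assumption: values are matched case-insensitively.
    String normalized = raw.trim().toUpperCase(Locale.ROOT);
    try {
      return Optional.of(GeoIpUsage.valueOf(normalized));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unsupported index.geoip.usage value: '" + raw + "'", e);
    }
  }
}
```

Under these assumptions, an existing configuration that still sets insightsService would fail fast instead of silently doing nothing, which is one way the breaking change could be surfaced to operators.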

Future work

I can anticipate a use case where multiple MaxMind DBs and/or web service lookups may need to be chained together, with the results aggregated within one NutchDocument. I did not wish to complicate this PR any further, though, so any implementation will first be described in another Jira ticket.
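
As a rough illustration of the chaining idea only, the sketch below runs several lookups in order and merges their fields into one map. Everything in it is hypothetical: GeoLookup stands in for a single MaxMind DB or web service lookup, the plain Map stands in for the fields added to a NutchDocument, and the "first source to set a field wins" merge rule is an assumption, not a design proposed in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for one MaxMind database or web service lookup.
interface GeoLookup {
  // Returns the fields found for the address, or empty if the source has no entry.
  Optional<Map<String, String>> tryLookup(String ipAddress);
}

// Hypothetical aggregator; not part of the index-geoip plugin.
final class ChainedGeoLookup {
  private final List<GeoLookup> lookups;

  ChainedGeoLookup(List<GeoLookup> lookups) {
    // Defensive copy so later changes to the caller's list do not affect the chain.
    this.lookups = new ArrayList<>(lookups);
  }

  // Runs every lookup in order and merges the results.
  // Assumption: the first source to supply a field wins.
  Map<String, String> lookupAll(String ipAddress) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (GeoLookup lookup : lookups) {
      lookup.tryLookup(ipAddress)
          .ifPresent(fields -> fields.forEach(merged::putIfAbsent));
    }
    return merged;
  }
}
```

Returning an Optional from each lookup follows the same idea as the try methods mentioned above: a source that does not contain the address is skipped without throwing an exception.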

lewismc marked this pull request as draft September 12, 2024 23:56
lewismc self-assigned this Sep 13, 2024

