Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Cover image for Our experience at Google hashcode 2020
Helene Jonin
Helene Jonin

Posted on

     

Our experience at Google hashcode 2020

We are four developers atOpenAirlines, a startup helping airlines reducing their fuel consumption and CO2 emissions, based in Toulouse, France. The 20th of February, we participated in the 1st qualification round of theGoogle hashcode coding competition.

The event took place from 5:30pm to 9:30pm UTC, 4 hours during which we had to solve anoptimization problem. We were competing as a team from a local hub,Toulouse GDG, with other teams and of course beer, chips & pizzas.

Let me explain you the process we underwent to try to solve the problem.

Problem

As Google did to fill up its Google Books database, we had to find the best process to scanbooks (with ascoreS) from a number oflibraries (L) given some constraints and to get the highest score (the sum of the scores of all books that are scanned withinDdays).

Each library has anumber of books (B), anumber of books that can be scanned each day (M) and asign up time for scanning (T).
Libraries can only have one copy of a book but many libraries can have a copy of the same book. When multiple copies of the same books are scanned, their score are only counted once. Only one library can sign up at a time.

Given six different input datasets of different sizes, we had to provide for each one of them an ordered list of libraries to sign up along the ordered list of their books to scan.

You can find the problem statementhere when it is published.

Resolution

Of course, the aim of the exercise was not to compute every options; there are too many: all combinations of libraries combined with all combinations of their books! So, we had to code an algorithm that would lead us to the best approaching result.

Our first thought was togive libraries a rank and select the top ranked libraries to scan from within the time frame.

We knew that this rank would be based on the library sign up time, its scanning rate, its number of books, and the total score of its books.

First of all, we wrote someboilerplate code in Kotlin to parse input files and design our objects that would ease later iterations.

The base metric for rank

Here, the rank is themaximum score a library can get within a time frame.

We computed it as follow:

  • Compute the maximum number of books that can be scanned from a library within a time frame

maxBooks = (d - T) * M with d the time frame, d0 = D

  • Sort library books by decreasing score

  • Add the score of each book until the maximum is reached or there are no more books

maxScore += orderedBooks[i].score with i from 0 to min(B, maxBooks)

The whole algorithm was, until there are no more time or libraries left:

  • Compute all library ranks
  • Pick the best ranked library and add it to the result
  • Re-compute the rank for other with a new time frame substracted from the picked library sign up time(d = d - T)

This first approach allowed us to obtain a reasonable score for all input datasets in a good amount of time. We submitted a first version in less than 2h, reaching a score of 16,000,000.

First optimization

With this method the same book may be scanned twice from different libraries, thus not increasing our final score. A simple optimization was to consider these duplicates bykeeping a cache of already scanned books and remove them from the library score when computing its new rank.

maxScore += orderedBooks[i].score with i from 0 to min(B, maxBooks) if orderedBooks[i] not already scanned

We were able to gain a few points more only (nearly 20,000,000 score).

Towards a ratio

At that point we started running out of precise ideas. We decided to improve the rank byleveraging it instead of a maximum) with the number of days it would take the libraries to scan the books we take into account in the rank computation.

newRank = rank / i

Where we got lost

We were running out of ideas, so we tried to focus on particular cases.

For instance, all libraries of one of the input dataset (d) had the same sign up time, the same scanning rate and all books with the same score. In this case, the rank could just match the count of not already scanned books.

A better ratio

Later at the bar (a little too late for the competition), drinking one last beer we found out that our ratio was not a good choice. We should have cut up the sign up time instead, because of the blocking time it was inducing. Indeed, the more time a library takes to sign up, the more it should decrease its rank.

We finally reached 25,000,000, which is the score you'll get running our code you can find ongithub.

Some statistics:

InputtotalDaysscoretheoric max scorebooks deliveredtheoric max bookslibs deliveredtheoric max libs
a.txt721216622
b.txt10005,822,90010,000,00058,229100,00090100
c.txt100,0003,695,58930,076,41511,995100,00081210,000
d.txt300015,028,0105,109,00077,35478,60015,00030,000
e.txt2005,034,89712,548,64829,383100,0001551000
f.txt7005,308,03440,111,14212,993100,000171000

Example of display for input dataset b:

First column is the score of the library,O is for signup,v when library is scanning books,- for blocked signup.

099900: Ovvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv099800: -Ovvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv099600: --OOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv099400: ----OOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv099200: ------OOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv099000: --------OOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv098800: ----------OOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv098500: ------------OOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv098200: ---------------OOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv097900: ------------------OOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv097600: ---------------------OOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv097200: ------------------------OOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv096800: ----------------------------OOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv096400: --------------------------------OOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv096000: ------------------------------------OOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv095600: ----------------------------------------OOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv095100: --------------------------------------------OOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv094600: -------------------------------------------------OOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv094100: ------------------------------------------------------OOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv093600: -----------------------------------------------------------OOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv093000: ----------------------------------------------------------------OOOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv092400: ----------------------------------------------------------------------OOOOOOvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv091800: ----------------------------------------------------------------------------OOOOOOvvvvvvvvvvvvvvvvvvvvvvvvvv091200: ----------------------------------------------------------------------------------OOOOOOvvvvvvvvvvvvvvvvvvvv090500: ----------------------------------------------------------------------------------------OOOOOOOvvvvvvvvvvvvv089800: -----------------------------------------------------------------------------------------------OOOOOOOvvvvvv
Enter fullscreen modeExit fullscreen mode

Further ideas

We had some ideas we didn't have time to explore during the limited time of the contest... We computed some statistics on libraries, books... We also tried to display our results, and being able to have a look at them was very interesting: we are now thinking aboutcombining and iterating on the metrics, taking into account the remaining time to start the signing up of some libraries later...

Final words

At the end, our score placed us in the first half of all competitors around the world, which is not bad, but we mainly had a lot of fun and confirmed we were working out very good as a team!

Top comments(0)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

  • Location
    Toulouse
  • Work
    Software Engineer at openairlines
  • Joined

Trending onDEV CommunityHot

DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp