Fast, robust Google Flights scraper (API) for Python. (Probably)


AWeirdDev/flights


Apparently, it's always a better approach to interact with the Internal Google APIs. I'm working on that, and I'll deliver the results soon if my experimental project works out well.



A fast, strongly typed Google Flights scraper (API) implemented in Python. It is based on a Base64-encoded Protobuf string.

Documentation · Issues · PyPI

$ pip install fast-flights

Basics

TL;DR: To use fast-flights, you'll first create a filter (for ?tfs=) to perform a request. Then, pass flight_data, trip, seat, and passengers to use the API directly.

    from fast_flights import FlightData, Passengers, Result, get_flights

    result: Result = get_flights(
        flight_data=[
            FlightData(date="2025-01-01", from_airport="TPE", to_airport="MYJ")
        ],
        trip="one-way",
        seat="economy",
        passengers=Passengers(adults=2, children=1, infants_in_seat=0, infants_on_lap=0),
        fetch_mode="fallback",
    )

    print(result)

    # The price is currently... low/typical/high
    print("The price is currently", result.current_price)

Properties & usage for Result:

    result.current_price

    # Get the first flight
    flight = result.flights[0]

    flight.is_best
    flight.name
    flight.departure
    flight.arrival
    flight.arrival_time_ahead
    flight.duration
    flight.stops
    flight.delay  # may not be present
    flight.price
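As a rough illustration of working with these properties, here is a minimal sketch that filters and formats flights. It only relies on the attribute names listed above; `describe`, `best_only`, and the sample values are made up for this example, and the stand-in objects replace a real `result.flights` so the snippet runs without a network call.

```python
from types import SimpleNamespace


def describe(flight) -> str:
    """Build a one-line summary from the attributes listed above."""
    text = (
        f"{flight.name}: {flight.departure} -> {flight.arrival} "
        f"({flight.duration}, stops: {flight.stops}) for {flight.price}"
    )
    # `delay` may not be present, so read it defensively.
    delay = getattr(flight, "delay", None)
    if delay:
        text += f" [delay: {delay}]"
    return text


def best_only(flights) -> list:
    """Keep only the flights flagged with is_best."""
    return [flight for flight in flights if flight.is_best]


# Stand-in objects with made-up values (hypothetical, not real results).
# With the real library you would pass result.flights instead.
sample = [
    SimpleNamespace(is_best=True, name="Example Air", departure="8:00 AM",
                    arrival="11:30 AM", duration="3 hr 30 min", stops=0,
                    price="$120"),
    SimpleNamespace(is_best=False, name="Sample Jet", departure="1:00 PM",
                    arrival="6:45 PM", duration="5 hr 45 min", stops=1,
                    delay="30 min", price="$95"),
]

for flight in best_only(sample):
    print(describe(flight))
```

Reading `delay` through `getattr` keeps the helper from raising when the attribute is missing.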

Useless enums: Additionally, you can use the Airport enum to search for airports in code (as you type)! See _generated_enum.py in source.

    Airport.TAIPEI
                  ╭─────────────────────────────────╮
                  │ TAIPEI_SONGSHAN_AIRPORT         │
                  │ TAPACHULA_INTERNATIONAL_AIRPORT │
                  │ TAMPA_INTERNATIONAL_AIRPORT     │
                  ╰─────────────────────────────────╯

What's new

  • v2.0 – New (much more succinct) API, fallback support for Playwright serverless functions, and documentation!
  • v2.2 – Now supports local Playwright for sending requests.

Cookies & consent

The EU region is a bit tricky to solve for now, but the fallback support should be able to handle it.

Contributing

Contributions are welcome! I probably won't work on this project unless there's a need for a major update, but boy howdy do I love pull requests.


How it's made

The other day, I was making a chat-interface-based trip recommendation app and wanted to add a feature that can search for flights available for booking. My personal choice is definitely Google Flights since Google always has the best and most organized data on the web. Therefore, I searched for APIs on Google.

🔎Search
google flights api

The results? Bad. It seems like they discontinued this service and it now lives in the Graveyard of Google.

🧏‍♂️duffel.com
Google Flights API: How did it work & what happened to it?

The Google Flights API offered developers access to aggregated airline data, including flight times, availability, and prices. Over a decade ago, Google announced the acquisition of ITA Software Inc. which it used to develop its API. However, in 2018, Google ended access to the public-facing API and now only offers access through the QPX enterprise product.

That's awful! I've also looked for free alternatives but their rate limits and pricing are just 😬 (not a good fit/deal for everyone).


However, Google Flights has its own UI at flights.google.com. So, maybe I could just use Developer Tools to log the requests made and replicate all of that? Undoubtedly not! Their requests are full of numbers and unreadable text, so that's not the solution.

Perhaps we could scrape it? I mean, Google allowed many companies like SerpApi to scrape their web, just pretending like nothing happened... So let's scrape our own.

🔎Search
google flights api scraper pypi

Excluding the ones that are not active, I came across hugoglvs/google-flights-scraper on PyPI. I thought to myself: "ain't no way this is the solution!"

I checked hugoglvs's code on GitHub, and I immediately detected "playwright," my worst enemy. One word can describe it well: slow. Two words? Extremely slow. What's more, it doesn't even run on the 🗻 Edge because of configuration errors, missing libraries, etc. I could just reverse try.playwright.tech and use a better environment, but that's too risky if they added Cloudflare as an additional security barrier 😳.

Life tells me to never give up. Let's just take a look at their URL params...

https://www.google.com/travel/flights/search?tfs=CBwQAhoeEgoyMDI0LTA1LTI4agcIARIDVFBFcgcIARIDTVlKGh4SCjIwMjQtMDUtMzBqBwgBEgNNWUpyBwgBEgNUUEVAAUgBcAGCAQsI____________AZgBAQ&hl=en
| Param | Content | My past understanding |
|-------|---------|-----------------------|
| hl | en | Sets the language. |
| tfs | CBwQAhoeEgoyMDI0LTA1LTI4agcIARID… | What is this???? 🤮🤮 |

I removed the ?tfs= parameter and found out that it is what controls our request! And it looks so base64-y.
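To make the two parameters concrete, here is a minimal sketch that splits the example URL with Python's standard library. The `split_search_url` helper name is made up for this example; the URL is the one shown above.

```python
from urllib.parse import parse_qs, urlsplit

URL = (
    "https://www.google.com/travel/flights/search"
    "?tfs=CBwQAhoeEgoyMDI0LTA1LTI4agcIARIDVFBFcgcIARIDTVlKGh4SCjIwMjQtMDUtMzBq"
    "BwgBEgNNWUpyBwgBEgNUUEVAAUgBcAGCAQsI____________AZgBAQ&hl=en"
)


def split_search_url(url: str) -> dict[str, str]:
    """Return the query parameters of a search URL as a flat dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears only once here.
    return {key: values[0] for key, values in query.items()}


params = split_search_url(URL)
print(params["hl"])        # the language code
print(params["tfs"][:24])  # the start of the encoded filter
```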

If we decode it to raw text, we can still see the dates, but we're not quite there — there's too much unwanted Unicode text.
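A minimal sketch of that decoding step, using only the standard library. The string uses the URL-safe Base64 alphabet (note the underscores) and comes without `=` padding, so the padding is restored first. The `decode_tfs` helper name is made up for this example.

```python
import base64
import re

TFS = (
    "CBwQAhoeEgoyMDI0LTA1LTI4agcIARIDVFBFcgcIARIDTVlKGh4SCjIwMjQtMDUtMzBq"
    "BwgBEgNNWUpyBwgBEgNUUEVAAUgBcAGCAQsI____________AZgBAQ"
)


def decode_tfs(tfs: str) -> bytes:
    """Decode a URL-safe Base64 string, restoring any stripped '=' padding."""
    padding = "=" * (-len(tfs) % 4)
    return base64.urlsafe_b64decode(tfs + padding)


raw = decode_tfs(TFS)

# Runs of printable ASCII are the human-readable parts (dates, airport codes);
# the bytes in between are binary framing, and some of them happen to be
# printable too, which is why stray letters stick to the values.
readable = re.findall(rb"[ -~]{3,}", raw)
print(readable)
```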

Or maybe it's some kind of data-storing method Google uses? What if it's something like JSON? Let's look it up.

🔎Search
google's json alternative

🐣Result
Solution: The Power of Protocol Buffers

LinkedIn turned to Protocol Buffers, often referred to as protobuf, a binary serialization format developed by Google. The key advantage of Protocol Buffers is its efficiency, compactness, and speed, making it significantly faster than JSON for serialization and deserialization.

Gotcha, Protobuf! Let's feed it to an online decoder and see how it does:

🔎Search
protobuf decoder

🐣Result
protobuf-decoder.netlify.app

I then pasted the Base64-encoded string to the decoder and no way! It DID return valid data!

[Image: annotated Protobuf Decoder screenshot]

I immediately recognized the values — that's my data, that's my query!

So, I wrote some simple Protobuf code to decode the data.

    syntax = "proto3";

    message Airport {
        string name = 2;
    }

    message FlightInfo {
        string date = 2;
        Airport dep_airport = 13;
        Airport arr_airport = 14;
    }

    message GoogleSucks {
        repeated FlightInfo flights = 3;  // the field name `flights` is illustrative
    }

It works! Now, I won't consider myself an "experienced Protobuf developer" but rather a complete beginner.
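For the curious, the same decoding can be sketched by hand, without a Protobuf compiler. This is not how fast-flights does it; it is a minimal pure-Python reader for the Protobuf wire format that only handles the two wire types appearing in this string (varint and length-delimited). The field numbers 3, 2, 13, and 14 come from the schema above; the helper names `read_varint`, `iter_fields`, and `legs` are made up for this example.

```python
import base64

TFS = (
    "CBwQAhoeEgoyMDI0LTA1LTI4agcIARIDVFBFcgcIARIDTVlKGh4SCjIwMjQtMDUtMzBq"
    "BwgBEgNNWUpyBwgBEgNUUEVAAUgBcAGCAQsI____________AZgBAQ"
)


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return value, pos
        shift += 7


def iter_fields(buf: bytes):
    """Yield (field_number, value) for varint and length-delimited fields."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        yield number, value


def legs(tfs: str) -> list[tuple[str, str, str]]:
    """Extract (date, departure, arrival) from every FlightInfo (field 3)."""
    raw = base64.urlsafe_b64decode(tfs + "=" * (-len(tfs) % 4))
    found = []
    for number, value in iter_fields(raw):
        if number != 3:
            continue  # other top-level fields are ignored here
        info = dict(iter_fields(value))
        date = info[2].decode()
        dep = dict(iter_fields(info[13]))[2].decode()
        arr = dict(iter_fields(info[14]))[2].decode()
        found.append((date, dep, arr))
    return found


print(legs(TFS))
```

Each field starts with a tag varint whose low three bits give the wire type and whose remaining bits give the field number, which is why `tag >> 3` and `tag & 0x07` are enough to walk the message.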

I have no idea what I wrote but... it worked! And here it is, fast-flights.
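Going the other way, a filter string can be sketched by hand-encoding the fields from the schema above. This is an illustration of the idea, not the library's actual implementation: it covers only the date and the two airports, skips the other fields present in the real URL (trip type, seat, passengers, and so on), and I have not checked whether Google accepts such a reduced string. The helper names `varint`, `len_field`, `airport`, `flight_info`, and `build_tfs` are made up for this example.

```python
import base64


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def len_field(number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2)."""
    return varint(number << 3 | 2) + varint(len(payload)) + payload


def airport(code: str) -> bytes:
    return len_field(2, code.encode())  # Airport.name = 2


def flight_info(date: str, dep: str, arr: str) -> bytes:
    return (
        len_field(2, date.encode())      # FlightInfo.date = 2
        + len_field(13, airport(dep))    # FlightInfo.dep_airport = 13
        + len_field(14, airport(arr))    # FlightInfo.arr_airport = 14
    )


def build_tfs(legs) -> str:
    """Serialize the legs as repeated field 3 and Base64-encode the result."""
    message = b"".join(len_field(3, flight_info(*leg)) for leg in legs)
    # URL-safe alphabet, with the '=' padding stripped as in the real URL.
    return base64.urlsafe_b64encode(message).decode().rstrip("=")


print(build_tfs([("2025-01-01", "TPE", "MYJ")]))
```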


(c) 2024-2025 AWeirdDev, and other awesome people

