DEV Community
Illia Zub for SerpApi

Posted on • Originally published at serpapi.com

     

Scrape Walmart Search for a specific store

Walmart responds with results for Sacramento for requests outside of the US.


But how do you search for products that are available in a specific store? Any Walmart store can be selected without browser automation, just by setting the relevant cookies in a plain HTTP request.

To figure this out on your own, knowledge of JS and browser dev tools is enough. Some Ruby knowledge is required to follow this post.

Location cookies

I changed the location several times and checked the browser's Dev Tools -> Application -> Cookies.


Several cookies are updated after choosing a different location: locGuestData, locDataV3, assortmentStoreId, ACID, hasACID, and hasLocData.

location-data also looks relevant, but it contains the postal code and address of a store I haven't chosen. Maybe it was used before Walmart migrated to a GraphQL API.

locDataV3 and locGuestData are Base64- and URI-encoded JSON objects. locDataV3 contains more data than locGuestData, but the data of locGuestData can be used for both.

ACID is a UUID. It can be generated on the client.

hasACID and hasLocData are flags.

Understanding locGuestData

Let's check what's inside this cookie value to understand how to set the store ID.

Example of encoded locGuestData

When sending requests to Walmart, locGuestData is a Base64-encoded string.

eyJpbnRlbnQiOiJTSElQUElORyIsInN0b3JlSW50ZW50IjoiUElDS1VQIiwibWVyZ2VGbGFnIjp0cnVlLCJwaWNrdXAiOnsibm9kZUlkIjoiNDExNSIsInRpbWVzdGFtcCI6MTYzNzMyODUwMDUyM30sInBvc3RhbENvZGUiOnsiYmFzZSI6Ijc4MTU0IiwidGltZXN0YW1wIjoxNjM3MzI4NTAwNTIzfSwidmFsaWRhdGVLZXkiOiJwcm9kOnYyOjUyNzNlMDFjLTA4NzAtNGUwOS05ODU4LTAzYTI2ZDQ5N2ZhOSJ9

Example of decoded locGuestData

This Base64 string is an encoded JSON object. Decoding it in the browser console:

JSON.parse(decodeURIComponent(atob("eyJpbnRlbnQiOiJTSElQUElORyIsInN0b3JlSW50ZW50IjoiUElDS1VQIiwibWVyZ2VGbGFnIjp0cnVlLCJwaWNrdXAiOnsibm9kZUlkIjoiNDExNSIsInRpbWVzdGFtcCI6MTYzNzMyODUwMDUyM30sInBvc3RhbENvZGUiOnsiYmFzZSI6Ijc4MTU0IiwidGltZXN0YW1wIjoxNjM3MzI4NTAwNTIzfSwidmFsaWRhdGVLZXkiOiJwcm9kOnYyOjUyNzNlMDFjLTA4NzAtNGUwOS05ODU4LTAzYTI2ZDQ5N2ZhOSJ9")))

{
  "intent": "SHIPPING",
  "storeIntent": "PICKUP",
  "mergeFlag": true,
  "pickup": { "nodeId": "4115", "timestamp": 1637328500523 },
  "postalCode": { "base": "78154", "timestamp": 1637328500523 },
  "validateKey": "prod:v2:5273e01c-0870-4e09-9858-03a26d497fa9"
}

After changing the Walmart store several times, I saw that nodeId and postalCode.base change with it.
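Since the rest of the post uses Ruby, the same decoding step can be sketched there as well. This is a minimal sketch, not part of the original code: the helper name decode_loc_guest_data is hypothetical, and it assumes the cookie value is Base64 over a URI-encoded JSON string, as described above.

```ruby
require "base64"
require "json"
require "uri"

# Decode a locGuestData cookie value: Base64 -> URI-decoded string -> JSON.
# decode_loc_guest_data is a hypothetical helper name.
# Note: decode_www_form_component also turns "+" into a space, which
# differs slightly from JavaScript's decodeURIComponent.
def decode_loc_guest_data(cookie_value)
  json = URI.decode_www_form_component(Base64.decode64(cookie_value))
  JSON.parse(json)
end

# The encoded example from this post.
SAMPLE_LOC_GUEST_DATA = "eyJpbnRlbnQiOiJTSElQUElORyIsInN0b3JlSW50ZW50IjoiUElDS1VQIiwibWVyZ2VGbGFnIjp0cnVlLCJwaWNrdXAiOnsibm9kZUlkIjoiNDExNSIsInRpbWVzdGFtcCI6MTYzNzMyODUwMDUyM30sInBvc3RhbENvZGUiOnsiYmFzZSI6Ijc4MTU0IiwidGltZXN0YW1wIjoxNjM3MzI4NTAwNTIzfSwidmFsaWRhdGVLZXkiOiJwcm9kOnYyOjUyNzNlMDFjLTA4NzAtNGUwOS05ODU4LTAzYTI2ZDQ5N2ZhOSJ9"

data = decode_loc_guest_data(SAMPLE_LOC_GUEST_DATA)
puts data.dig("pickup", "nodeId")   # the store ID
puts data.dig("postalCode", "base") # the postal code
```

Decoding a cookie copied from the browser this way is a quick check of which fields change when a different store is selected.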

Generate timestamp and acid for locGuestData

timestamp and acid can be generated on every request.

require "securerandom"

timestamp = Time.now.to_i
acid = SecureRandom.uuid

Base64-encode location data

Next, let's Base64-encode that JSON string as Walmart expects.

require "base64"
require "json"
require "securerandom"

# store_id and postal_code are supplied by the caller
timestamp = Time.now.to_i
acid = SecureRandom.uuid

location_guest_data = {
  intent: "SHIPPING",
  storeIntent: "PICKUP",
  mergeFlag: true,
  pickup: { nodeId: store_id, timestamp: timestamp },
  postalCode: { base: postal_code, timestamp: timestamp },
  validateKey: "prod:v2:#{acid}"
}

encoded_location_data = Base64.urlsafe_encode64(JSON.dump(location_guest_data))

Create cookie string

Finally, the location cookie string combines all the required fields.

%(ACID=#{acid}; hasACID=true; hasLocData=1; locDataV3=#{location_guest_data}; assortmentStoreId=#{store_id}; locGuestData=#{encoded_location_data})

Complete function to create Walmart location cookie

Putting it all together.

require "base64"
require "json"
require "securerandom"

def location_cookie(store_id, postal_code)
  # blank? comes from ActiveSupport
  return if store_id.blank?

  timestamp = Time.now.to_i
  acid = SecureRandom.uuid

  location_guest_data = {
    intent: "SHIPPING",
    storeIntent: "PICKUP",
    mergeFlag: true,
    pickup: { nodeId: store_id, timestamp: timestamp },
    postalCode: { base: postal_code, timestamp: timestamp },
    validateKey: "prod:v2:#{acid}"
  }

  encoded_location_data = Base64.urlsafe_encode64(JSON.dump(location_guest_data))

  %(ACID=#{acid}; hasACID=true; hasLocData=1; locDataV3=#{location_guest_data}; assortmentStoreId=#{store_id}; locGuestData=#{encoded_location_data})
end

Then make an HTTP request using the language and libraries you've chosen.

import got from 'got';

const STORE_ID = "4115";
const POSTAL_CODE = "78154";

// getLocationCookie is assumed to be a JS port of location_cookie
const locationCookie = getLocationCookie(STORE_ID, POSTAL_CODE);

const htmlResponse = await got('https://www.walmart.com/search?q=cookie', {
  headers: { cookie: locationCookie }
});
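For readers staying in Ruby, the same request can be sketched with the standard library's Net::HTTP. The helper names build_search_request and fetch_search_html are hypothetical, and the cookie argument is assumed to be the string returned by location_cookie. Whether Walmart expects further headers is not covered here.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a GET request for a Walmart search with the
# location cookie attached. Helper names are hypothetical.
def build_search_request(query, cookie)
  uri = URI("https://www.walmart.com/search")
  uri.query = URI.encode_www_form(q: query)

  request = Net::HTTP::Get.new(uri)
  request["Cookie"] = cookie
  request
end

# Send the request over HTTPS and return the response body (HTML).
def fetch_search_html(query, cookie)
  request = build_search_request(query, cookie)
  uri = request.uri

  Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(request).body
  end
end
```

Splitting request construction from sending keeps the cookie handling easy to inspect without touching the network, for example `fetch_search_html("cookie", location_cookie("4115", "78154"))` once the cookie function is loaded.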


Where to get store ID and postal code

We wouldn't hard-code the store ID and postal code into the web scraping program, though. A CSV of 4.6k stores can be used to find the store ID dynamically.

Programmatic usage of the CSV is out of the scope of this post. All that is needed is to find the store ID and postal code for a specific location in the table.
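Still, to give an idea of how small that lookup is, here is a minimal sketch with Ruby's csv library. The column names (store_id, postal_code, city, state) and the find_store helper are assumptions for illustration; the real CSV may be laid out differently. The sample row reuses the Decatur, AL store that appears later in this post.

```ruby
require "csv"

# Find the first store row matching a city and state (case-insensitive).
# The column names used here are hypothetical.
def find_store(csv_text, city:, state:)
  CSV.parse(csv_text, headers: true).find do |row|
    row["city"].casecmp?(city) && row["state"].casecmp?(state)
  end
end

SAMPLE_STORES_CSV = <<~DATA
  store_id,postal_code,city,state
  2488,35601,Decatur,AL
DATA

store = find_store(SAMPLE_STORES_CSV, city: "decatur", state: "AL")
puts "#{store["store_id"]} #{store["postal_code"]}" if store
```

The resulting store_id and postal_code are the two arguments location_cookie needs.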

Updating a list of Walmart store IDs and locations

Walmart provides several sources to find stores. Data can be populated from one of those sources:

Store Directory

Store Directory contains links on four levels: country, states, cities, and stores. To get the data, iterate over all elements on the specific level and make subsequent requests.

States

Assuming the country is the US, 51 states can be hard-coded. Walmart front-end requests data from the JSON endpoint https://www.walmart.com/store/electrode/api/store-directory. It accepts the st search parameter.

Example: https://www.walmart.com/store/electrode/api/store-directory?st=AL.

It returns a list of cities. Each city object contains city, and either storeId or storeCount. A city with storeId contains a single store. A city with storeCount contains multiple stores.
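The distinction between the two kinds of city objects can be sketched in Ruby as follows. The partition_cities helper is hypothetical, and the sample response is made up to match the shape described above; its values are illustrative, not real directory data.

```ruby
require "json"

# Split directory entries into single-store cities (storeId present)
# and multi-store cities (storeCount present).
def partition_cities(cities)
  cities.partition { |city| city["storeId"] && !city["storeCount"] }
end

# Hypothetical response shaped as described above; values are illustrative.
sample_cities = JSON.parse(<<~DATA)
  [
    {"city": "Example City", "storeId": 5744},
    {"city": "Decatur", "storeCount": 2}
  ]
DATA

single_store, multi_store = partition_cities(sample_cities)
puts single_store.map { |city| city["city"] }.inspect
puts multi_store.map { |city| city["city"] }.inspect
```

Single-store cities then go through the HTML store page, while multi-store cities need a second request to the directory endpoint.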

Single store in a city

A request to a specific store returns an HTML page. Example: https://www.walmart.com/store/5744.


The store address and postal code should be extracted from the HTML. The store ID is already in the URI.

let postalCode = document.querySelector(".store-address-postal[itemprop=postalCode]").textContent;
let address = document.querySelector(".store-address[itemprop=address]").textContent;

Multiple stores in a city

A request for multiple stores returns a JSON response. Cities with a single store respond with an empty array ([]), so for those we have to parse the HTML.

Example request for multiple stores

https://www.walmart.com/store/electrode/api/store-directory?st=AL&city=Decatur

Sample store from the response

{"displayName":"Neighborhood Market","storeName":"Neighborhood Market","address":"1203 6th Ave Se","phone":"256-822-6366","postalCode":"35601","storeId":2488}
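As a rough Ruby sketch of this step, the directory URL can be built from a state and an optional city, and each store object reduced to the fields the location cookie needs. The helper names directory_url and store_row are hypothetical; the endpoint and field names are the ones shown in this post.

```ruby
require "json"
require "uri"

DIRECTORY_URL = "https://www.walmart.com/store/electrode/api/store-directory"

# Build the directory URL for a state, optionally narrowed to a city.
def directory_url(state, city = nil)
  params = { st: state }
  params[:city] = city if city
  "#{DIRECTORY_URL}?#{URI.encode_www_form(params)}"
end

# Keep only the fields needed later; storeId is converted to a string
# because the decoded cookie stores nodeId as a string.
def store_row(store)
  {
    store_id: store["storeId"].to_s,
    postal_code: store["postalCode"],
    address: store["address"]
  }
end

sample_store = JSON.parse('{"displayName":"Neighborhood Market","storeName":"Neighborhood Market","address":"1203 6th Ave Se","phone":"256-822-6366","postalCode":"35601","storeId":2488}')

puts directory_url("AL", "Decatur")
puts store_row(sample_store).inspect
```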

Putting it all together

Pseudo-code to collect store IDs and locations for all US states.

// get, parseHTML, and csv.write are placeholders for an HTTP client,
// an HTML parser, and a CSV writer.
const STATES = ["AL", "TX", "CA", /* ... */];
let walmartStores = [];

for (let state of STATES) {
  let cities = get(`https://www.walmart.com/store/electrode/api/store-directory?st=${state}`);

  for (let { storeId, storeCount, city } of cities) {
    if (storeId && !storeCount) {
      // Single store in the city: parse the store page HTML
      let store = get(`https://www.walmart.com/store/${storeId}`);
      let document = parseHTML(store);
      let postalCode = document.querySelector(".store-address-postal[itemprop=postalCode]").textContent;
      let address = document.querySelector(".store-address[itemprop=address]").textContent;

      walmartStores.push({ postalCode, address, storeId });
    } else if (!storeId && storeCount > 0) {
      // Multiple stores in the city: the directory endpoint returns them as JSON
      let stores = get(`https://www.walmart.com/store/electrode/api/store-directory?st=${state}&city=${city}`);

      // push mutates the array; concat would return a new array and discard it
      walmartStores.push(...stores);
    }
  }
}

csv.write("walmart_stores.csv", walmartStores);

Existing programs to scrape Walmart Stores

A search on GitHub via grep.app shows four relevant repositories:

$ curl -s https://raw.githubusercontent.com/akamai/edgeworkers-examples/master/edgecompute/examples/personalization/storelocator/data/locations.json | jq '.elements[].tags | select(."ref:walmart" != null) | .ref' | wc -l
471

  • scrapehero/walmart_store_locator which scrapes stores by postal codes. But finding a list of actual postal codes turned out to be harder than finding a list of actual US states.

  • theriley106/WaltonAnalytics which is great to extract data from Walmart but not Walmart stores.

  • GUI/covid-vaccine-spotter which scrapes stores by postal codes. But finding a list of actual postal codes turned out to be harder than finding a list of actual US states.

So I played with Rust and came up with this (rough) program.

After getting through the compilation errors, it worked well, thanks to this helpful blog post about async streams in Rust. Every time my program compiled, it actually worked. Fixing compilation errors is hard (for a non-Rustacean), but there was no need to debug the program at runtime, which is great.

Conclusion

Scraping Walmart is fairly easy: the search results page contains inline JSON data for all of its products.


Update location cookies to specify the location for plain HTTP requests to Walmart.

If you have anything to share, any questions, suggestions, or something that isn't working correctly, feel free to drop a comment in the comment section or reach out via Twitter at @ilyazub_ or @serp_api.

Yours,
Ilya, and the rest of the SerpApi Team.


Join us on Reddit | Twitter | YouTube
