Any Excel gurus here?

User avatar
elecplus

27 Jun 2017, 06:20

I have an old Excel 97 book with an example program to track stock prices. You entered the stocks you wanted, and the URLs of the sites you wanted to get prices from. It made a historical record, including charts and graphs and things. Could the same thing be used for mech kbds? Anybody want to tackle this?

Rimrul

27 Jun 2017, 18:56

I don't think any websites really keep a record of historic prices for specific keyboards. Stock prices are a different deal in that regard.

User avatar
elecplus

28 Jun 2017, 01:52

Right, we would be creating our own historical records from listed prices and sale dates on ebay or mm or DT or wherever else we can find them.

User avatar
Ir0n

28 Jun 2017, 01:56

this is a neat idea lol it'd be a quick and easy way to see the value of something..

Grubb

04 Jul 2017, 15:39

I had a shot at this. Probably a ton of things that can still be improved but it's a start. All data is manually entered so not much automation yet. Feel free to share and edit the document.

https://docs.google.com/spreadsheets/d/ ... sp=sharing

User avatar
Scarpia

04 Jul 2017, 21:37

Off the top of my head, I don't think Excel (or even Google Sheets) is a great tool for the job. With a bit of macro code you *can* connect to web sites or APIs to pull data (I have done this before to allow non-coding-savvy colleagues to pull data from APIs on demand), but it's not very manageable, and you can't really build a full scraper that way.

For a more flexible (and still simple) solution, I'd recommend using a web scraping library combined with a simple database (one or two simple tables should do for a start - in fact you could use a Google Sheet as a makeshift database for this) and a crontab to schedule periodic scraping. All you'd need to do is write a scraper definition/configuration for each source you'd want to grab data from, and let it run.
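To make that concrete, here's a rough sketch of what one scraper definition plus a tiny database could look like, using only the Python standard library. The page structure and regex are made up for illustration; a real scraper would fetch each site with urllib/requests, need its own parsing rules per source, and be run from a crontab entry (e.g. once a day).

```python
import re
import sqlite3
from datetime import date

# Hypothetical listing HTML -- stands in for a fetched page.
# Every real source would need its own parsing rules.
SAMPLE_HTML = """
<div class="listing"><span class="title">Apple M0118 keyboard</span><span class="price">$99.90</span></div>
<div class="listing"><span class="title">Apple keyboard vintage</span><span class="price">$49.50</span></div>
"""

LISTING_RE = re.compile(
    r'<span class="title">(?P<title>.*?)</span>'
    r'<span class="price">\$(?P<price>[\d.]+)</span>'
)

def parse_listings(html):
    """Extract (title, price) pairs from one page of listings."""
    return [(m["title"], float(m["price"])) for m in LISTING_RE.finditer(html)]

def store(conn, source, listings):
    """Append today's scraped listings to the prices table."""
    conn.execute("""CREATE TABLE IF NOT EXISTS prices
                    (scraped TEXT, source TEXT, price REAL, title TEXT)""")
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, ?)",
        [(date.today().isoformat(), source, p, t) for t, p in listings],
    )

conn = sqlite3.connect(":memory:")  # swap for a file (or a Google Sheet) in practice
store(conn, "ebay.com", parse_listings(SAMPLE_HTML))
rows = conn.execute("SELECT source, price, title FROM prices").fetchall()
print(rows)
```

One table like this is really all you need to start drawing the historical charts from the Excel example.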

Note that you will run into nasty deduplication issues, since unlike stocks (which have unique identifiers like AAPL or GOOG) keyboards are listed in completely haphazard ways, which means your database ends up containing entries that are actually the same keyboard model, but have different descriptions:

2017-06-27 | ebay.com | $99.90 | Apple M0118 keyboard
2017-06-27 | ebay.com | $49.50 | Apple keyboard vintage
2017-06-28 | ebay.com | $79.00 | Appel keyboard vintage
2017-07-01 | ebay.com | $56.00 | Keyboard Apple Alps Cherry retro clicky best price !!!!1111
2017-07-03 | ebay.com | $45.00 | Apple M0118 vintage keyboard almost new
2017-07-04 | ebay.com | $110.25 | Original Apple Mo118 keyboard w. box

In other words, you will have to deal with misspellings, missing model numbers, filtering of irrelevant details, etc.

You can deal with this in a few ways:

- Try to be clever and write your own rules/heuristics, but you will spend a long time getting not very far.
- Ignore the problem entirely and get messy data as a result, which can be unwieldy at best and misleading or unusable at worst.
- Go in manually and fix each entry that the scraper adds, which takes a bit of effort but is reliable and gives you usable structured data.
- Build a Machine Learning / AI system to do the language parsing better than a rule-based system could, but this could take quite some time to build, requires fairly specialized skills, and the result will land somewhere between the rule-based and the manual approach, so it might not be a viable option.
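To show how far (and how short) the rules/heuristics route gets you, here's a stdlib-only sketch: a couple of normalization rules plus fuzzy string matching to group the example listings above. The spam-word list and the 0.6 similarity cutoff are arbitrary assumptions, and real data will constantly find new ways around them.

```python
import re
from difflib import SequenceMatcher

# Words to strip before comparing -- an arbitrary, never-complete list.
SPAM_WORDS = {"best", "price", "retro", "clicky", "original", "w", "box"}

def normalize(title):
    """Lowercase, keep only word/number tokens, drop spam words, sort."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(sorted(w for w in words if w not in SPAM_WORDS))

def same_item(a, b, cutoff=0.6):
    """Fuzzy comparison of normalized titles -- catches 'Appel'/'Apple'."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= cutoff

titles = [
    "Apple M0118 keyboard",
    "Appel keyboard vintage",
    "Keyboard Apple Alps Cherry retro clicky best price !!!!1111",
    "Original Apple Mo118 keyboard w. box",
]

# Greedily cluster: each title joins the first group whose leader it matches.
groups = []
for t in titles:
    for g in groups:
        if same_item(t, g[0]):
            g.append(t)
            break
    else:
        groups.append([t])
print(groups)
```

This catches the misspelled and reworded M0118 listings, but the keyword-stuffed one still escapes into its own group, which is exactly the "long time getting not very far" problem.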

My recommendation: let the scraper find the candidates, then do a quick manual revision where you parse out the make, model number, etc. yourself. It'll take manual work, but we're scraping vintage keyboards for sale, not tweets; the volume coming in each day should be relatively manageable.

User avatar
elecplus

04 Jul 2017, 21:41

Where does one find such a scraper? My programming days are about 30 years ago...

User avatar
Scarpia

05 Jul 2017, 21:10

If you want to roll your own, you'll need a scraping code library; here's a list: http://opendir.io/cat/7/web-scraping

For a hosted solution where you can build your scraping configs visually (or more easily at least), but where you don't have as much control over the total solution, there's places like import.io (somewhat pricey) or its alternatives: https://www.quora.com/Who-are-the-compe ... -import-io (ParseHub looks like a popular choice: https://www.parsehub.com/). There's also ScrapingHub, which is another hosted solution where they offer some of the most popular libraries: https://scrapinghub.com/

I'd say take a few of them for a free trial spin and see which tool is the easiest to get up and running, and just go with it.
