Politico Europe Is a Business to Business News and Data Service

with Nick Janetakis and Karl Roos

Running in Production · Nick Janetakis

July 26, 2021 · 1h 12m

Show Notes

In this episode of Running in Production, Karl Roos goes over building a B2B news and data platform with Rails, Node and Python. It’s hosted on AWS using Elastic Beanstalk and has been up and running since 2014.

Karl talks about writing a Rails API back-end, scraping 400+ sites, executing 500k+ daily jobs, using a bunch of AWS resources, what it’s like dealing with a ~500 GB MySQL database, the importance of taking action and more.

Topics Include

  • 1:55 – What type of application are we talking about here?
  • 4:41 – Switching from PHP to a combo of Rails and Node
  • 8:02 – A few useful Ruby gems that were used to help build the app
  • 9:41 – The Vue front-end is for a customer facing dashboard
  • 14:48 – Where D3.js is being used to render charts and the data pipeline
  • 17:43 – Scraping data from 400+ sites and dealing with edge cases
  • 21:38 – The scraper runs on 10-16 EC2 instances through Elastic Beanstalk
  • 26:04 – Each separate service lives in its own Git repository, plus a bit of Serverless
  • 31:06 – Sticking with the latest stable version of Rails and updating dependencies
  • 33:10 – Sprinkles of Python to glue together a few AWS services and translating languages
  • 36:11 – What it’s like using Elastic Beanstalk and executing 500k+ jobs a day
  • 40:04 – Initial AWS credits helped sway the decision to try out AWS
  • 43:47 – How CloudFormation and Terraform are being used
  • 48:59 – All devs can push to production, code reviews, linting and the deploy process
  • 54:21 – Dealing with database migrations and a ~400-500 GB data set
  • 1:01:17 – Getting a local dump of the DB in development, seeding data and secrets
  • 1:06:03 – Database backups are done with the built-in RDS snapshots
  • 1:10:19 – Best tips? Less thinking, more doing and learn from your experiments
  • 1:11:35 – Karl is on GitHub and Twitter