Scaling Up with R and Arrow Book

In a first for this blog, I’m starting a running tag with worthy reads that might interest other technically-minded individuals.
Tags: Worth Reading, R Stats, Open Source
Author: Andrew Lis

Published: August 14, 2024

I was chatting on the Vancouver Real Estate Podcast the other day, and at the end of their show, they always ask their guests a series of lighthearted questions.

It’s a segment I quite enjoy because you get to know different sides of their guests, and I think that can be helpful when talking about a polarizing subject such as real estate.

But one of the questions is which book(s) you’ve been reading lately and whether you have a recommendation, and that one usually has me a little stumped.

In previous episodes of the show that I was on, I semi-jokingly said that I don’t really read many books (I’m more of a technical document kind of guy), which is why I get stumped when someone asks me for a recommendation.

It’s not that I’ve never read fiction or anything. On occasion, I dabble. And I’ve read most of the classic stuff.

But when I decide to invest time sitting down to carefully pore over words on a page that somebody (at some point) spent a lot of time jotting down, I tend to prefer learning something new that I can apply immediately to the various things I do in my waking hours.

Anyway, all this to say: it occurred to me that I do read a lot of technical information, but I rarely take the time to jot down (or share) which of the things I read were truly worth reading.

So, in a first for this blog, I’m going to start a tag called “Worth Reading” where I’ll occasionally link to something I read that, as the tag implies, was “worth reading”.

To kick it off, here’s a (short) book on making good use of Apache Arrow with R:

Scaling Up with R and Arrow

The Coles Notes are basically that the Arrow (and Parquet) data formats are genius: they’re super fast and memory efficient, and they let you work with “big” data without necessarily having to resort to distributed computing.

And it works very nicely with R and the tidyverse/dplyr syntax to boot.
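
To give a flavour of what that looks like, here’s a minimal sketch (my own illustration, not an excerpt from the book). It assumes the arrow and dplyr packages are installed, and it uses the built-in mtcars data as a stand-in for a genuinely large dataset. The names data_dir, cars, and result are just placeholders.

```r
library(arrow)
library(dplyr)

# Write a small built-in data frame to a temporary folder as Parquet files,
# partitioned by cylinder count (a stand-in for a much larger dataset).
data_dir <- file.path(tempdir(), "mtcars_parquet")
write_dataset(mtcars, data_dir, format = "parquet", partitioning = "cyl")

# open_dataset() points at the files without loading the data into memory.
cars <- open_dataset(data_dir)

# The dplyr verbs build up a lazy query; collect() runs it and
# returns the (small) result as a regular tibble.
result <- cars |>
  filter(mpg > 20) |>
  group_by(cyl) |>
  summarise(mean_hp = mean(hp), n_cars = n()) |>
  collect()

print(result)
```

The nice part is that only the final summary ends up in R’s memory, so in principle the same pattern can be pointed at a folder of Parquet files that is far bigger than your RAM.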

If that sounds like your jam, it’s definitely… worth reading.