Jacob Pettit

My blog about learning machine learning and other cool, science-y things.


Beginner-friendly reinforcement learning with rlpack

[gif: Trained PPO agent playing LunarLander]

Lately, I’ve been working on learning more about deep reinforcement learning and decided to write my own RL framework as a way to get really familiar with some of the algorithms. In the process, I also thought it would be cool to make the framework a resource for beginners to easily get started with reinforcement learning. With that in mind, the goal wasn’t state-of-the-art performance. However, in the future, I might decide that a high standard of performance is something I want to prioritize over beginner-friendliness.

This framework isn’t finished, but this post marks the first version of it that I’m happy calling done. Up until now, it’s been the main project I’ve focused on, but moving forward I’ll put more time into exploring different things and work on expanding rlpack more passively. Find the repository here or...

Continue reading →


Making it easier to play my Tic-Tac-Toe agent

Background

As part of my senior project in undergrad, I built a Tic-Tac-Toe-playing RL agent. It used a simple temporal difference (TD) update rule that can be found in Chapter 1 of Sutton and Barto’s RL book. In fact, all of the details for how to build the agent can be found in that chapter: they do a case study of a Tic-Tac-Toe-playing algorithm and cover everything from what’s needed from the environment to the update rule for the agent. It’s definitely worth a read, especially if you want to implement one yourself.
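For reference, the TD update from that chapter is V(s) ← V(s) + α[V(s′) − V(s)]. Here’s a minimal sketch of it in Python; the state keys, default value, and step size are illustrative choices, not the actual code in my practicum repository:

```python
# Tabular TD value update from Chapter 1 of Sutton & Barto:
#   V(s) <- V(s) + alpha * (V(s') - V(s))
# States are keyed by strings; unseen states start at a default value.

def td_update(values, state, next_state, alpha=0.1, default=0.5):
    """Move V(state) a fraction alpha toward V(next_state)."""
    v_s = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v_s + alpha * (v_next - v_s)
    return values[state]

# Usage: a board where X has won is worth 1; earlier states are
# nudged toward the values of the states that follow them.
values = {"XOX|OXO|X..": 1.0}
td_update(values, ".........", "XOX|OXO|X..")  # 0.5 + 0.1*(1.0 - 0.5) = 0.55
```

Played over many games, these small backups propagate the value of winning positions back toward the opening states.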

The Update

A friend asked me how he could play against my trained agent, so I went ahead and wrote a simple script to make it easy for anyone (with a tiny bit of terminal knowledge) to play against it. Here’s how to do it:

Head over to my GitHub repository and clone it:

git clone https://github.com/jfpettit/senior-practicum.git

Once you’ve cloned it, go...

Continue reading →


Introducing gym-snake-rl


Motivation

Although there are existing implementations of the classic Snake game (play the game here), I wanted to create my own implementation for a few reasons:

  • Learn more about environment design, including designing an observation space and reward signals.
  • Write an implementation with random map generation, so that this code can be used to work on generalization in RL. See OpenAI’s blog post on this topic.
  • Make Snake a multi-agent system, with versions of the environment that have fewer units of food than there are snakes, so that we can investigate what competitive behavior emerges.
  • Implement a vectorized observation space for the Snake game, aiming to require less computational power than games that only provide screen images as observations. For example, CoinRun, OpenAI’s procedurally generated environment for working on...
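To make the vectorized-observation idea concrete, here’s a minimal Gym-style sketch. The class name, observation layout (head position plus offset to the food), and reward are hypothetical simplifications for illustration, not gym-snake-rl’s actual API:

```python
import numpy as np

class TinySnakeEnv:
    """Toy single-snake grid world with a compact vector observation."""

    def __init__(self, size=8):
        self.size = size
        self.reset()

    def reset(self):
        self.head = np.array([self.size // 2, self.size // 2])
        self.food = np.array([0, 0])
        return self._obs()

    def _obs(self):
        # A 4-dimensional vector instead of a full screen image:
        # head position plus the offset to the food, scaled by grid size.
        return np.concatenate([self.head, self.food - self.head]) / self.size

    def step(self, action):
        # Actions 0-3: up, down, left, right; the snake can't leave the grid.
        moves = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]])
        self.head = np.clip(self.head + moves[action], 0, self.size - 1)
        ate = np.array_equal(self.head, self.food)
        reward = 1.0 if ate else 0.0
        return self._obs(), reward, ate, {}
```

A policy consuming this observation only needs a small network, whereas pixel observations generally call for a convolutional encoder.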

Continue reading →