A Draft Guide to Vibecoding Data Visualizations and Analysis
By David Gros. Version 0.1.0

This post is currently a draft with only "part 1" written up.

A few years ago, doing data analysis and visualization meant either settling for the constraints of tools like Excel or learning complex programming tools, where even experts frequently Googled details of how to use them. Now AI has made it dramatically easier to use the same tools as experts. For small to medium datasets or problems, AI programming ("vibecoding") is likely one of the best ways to do analysis.

This is a rough draft guide for exploring vibecoding for data visualization (data viz). I started collecting notes after I helped lead a small workshop on this topic.

Currently this post is a v0.1 version where I have written some of the points about choosing and installing a vibecoding tool, as well as a sketch of tips when using the tool. However, it is missing the key parts on actually using said tool. The talk I gave included working through a dataset. This is a critical part of a final guide, but it still needs to be integrated.

Vibecoding is a term, growing in popularity, for programming with AI. We forget the code exists and just ✨vibe✨.

"""
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good ... I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.
"""
Andrej Karpathy on X. Feb 2, 2025.
In this guide we are going to be using Cursor, a popular tool for vibecoding.

There are a lot of tools for vibecoding. It can be helpful to organize them into a few categories:

(1) General Chat Tools: Using ChatGPT (or Claude, etc.), you can ask it to analyze or visualize your data. This gets clunky when doing something complex multiple times. You have to keep uploading your data to multiple chats, re-explaining your problem, and copying files. Features like Projects or Memories can help, but fall short of the alternatives here. Still, it can be worth starting here when you know the task is straightforward and you won't be returning to the problem.

(2) Integrated Development Environments (IDE): Tools that integrate into the editors professional software engineers use. This is where Cursor is, along with tools like VS Code Copilot or JetBrains Junie.

(3) Command Line Interfaces (CLI): Tools that you use through a command line terminal (eg, Claude Code, Codex CLI, etc). These focus on chatting about what you want, with deep integration with a code editor coming secondary. They edit and run code on your computer. These work well. If you already pay for something like Claude, using Claude Code can be good. If you already pay for ChatGPT, using Codex CLI can be good.

(4) Cloud Agents: Tools that run in the cloud rather than on your local computer, integrating with the tools software developers already use (eg, Claude Code Web, Codex Web, etc). These are increasingly good for lots of kinds of software development. However, for data visualization or analysis, a quicker loop with files on your computer can be much easier.

(5) Vibe-first Web Tools: Tools like Lovable or Replit are designed around making websites or apps. However, for the kinds of data visualization this guide focuses on, making a website or full app is not usually needed.

This guide chooses to focus on Cursor because:

However, realistically most of the tools will work. I wouldn't stress over this decision.
Any tool in category #2 or #3 has a low cost to switch. The cloud-based tools in categories #4 and #5 can take more steps to move out of one provider's cloud and onto another platform than tools whose files live on your computer.

Installing Cursor from https://cursor.com/download is pretty straightforward. There are extra steps for installing Python and Git. In the workshop it took around 45 minutes to get everyone set up, but it was very doable.

I put together a starter template for the talk, which can be found online. In a future iteration I want to build this out more and explain how to use it.

This section needs to be added to the guide. In the talk it was done interactively, but I can pull out nice screenshots.

In this part of the talk we worked through analyzing this dataset of US baby names from 1890-2024. It's a very cool dataset that I think people at the workshop had fun with. It's also an example of data that would be a real pain to work with in something like Excel, as every year is its own file. However, point Cursor at the folder, and it has no problem figuring out how to process it. A future iteration of this guide needs to go through this.

I also curated a few other datasets that participants could play with, which need to be put together in this guide.

Here is a sketch of some tips for vibecoding data viz.

Will you destroy your computer? Probably not. The AIs are smart.

If you don't know what is going on, ask the model to explain by saying, "I'm completely unfamiliar with X. Please explain."

You can keep track of what you are writing up or your research questions in a file, and then just give it to the model as context. For example: "I'm working on @report.md and now trying to..."

There are a few things to know about:

The LLMs are trained to do hard coding tasks, and are penalized when there are error messages or crashes. Thus, they tend to just make the errors silently go away. Eg, you might ask for a scatter plot of some data, but then half the data points have missing values.
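One way to catch this is to count the missing values yourself before trusting a plot. The following is a minimal sketch in plain Python, not a definitive method: the `is_missing` and `missing_report` helpers, the column names, and the toy rows are all hypothetical, and it assumes missing values show up as `None`, empty strings, or NaN.

```python
import math


def is_missing(value):
    """Treat None, empty or whitespace-only strings, and NaN as missing (an assumption)."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, float):
        return math.isnan(value)
    return False


def missing_report(rows, x_col, y_col):
    """Count how many rows a scatter plot of x_col vs y_col could not draw."""
    total = len(rows)
    missing_x = sum(is_missing(row.get(x_col)) for row in rows)
    missing_y = sum(is_missing(row.get(y_col)) for row in rows)
    # A point is only plottable when both coordinates are present.
    plottable = sum(
        not is_missing(row.get(x_col)) and not is_missing(row.get(y_col))
        for row in rows
    )
    return {
        "total": total,
        "missing_x": missing_x,
        "missing_y": missing_y,
        "plottable": plottable,
        "dropped": total - plottable,
    }


# Hypothetical toy data, not taken from any real dataset.
rows = [
    {"height": 170.0, "weight": 65.0},
    {"height": None, "weight": 80.0},
    {"height": 182.0, "weight": ""},
    {"height": float("nan"), "weight": None},
]

report = missing_report(rows, "height", "weight")
print(report)
```

If the `dropped` count is a large share of `total`, that is a cue to ask the model how it handled those rows. With pandas, a similar check is `df[["height", "weight"]].isna().sum()`.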
The LLM might just ignore these missing values without you realizing it.

You might need to dig deeper to explore any inconsistencies. Without prompting, the models will not necessarily cross-compare results.

This draft guide starts with setting up a vibecoding tool for data visualization. Further iteration is needed. It is an interesting topic, and I hope there is further work in helping people understand the possibilities here.

What is Vibecoding?
Setup
Why Cursor?
Installing
Getting to A First Analysis
Touring Cursor Interface
Exploring a Sample Dataset
Keeping the Good Vibes
Will I destroy my computer?
You can ask when you are confused
Context is good!
Look for opportunities to start new chats
Use the reset function
Ask the model not to write any code
Know the models like to make things seem good
If things don't seem right, they might not be
Conclusion
