During my senior year at Washington University in St. Louis, I had the immense pleasure of leading the AR/VR team for the WashU Robotics Club. Together, we brought to life WashU Wayfinder - an app designed to guide users around the campus using augmented reality. Big thanks to the rest of the team that made the app possible, especially with no merge conflicts :)
In this article, I talk about how I coded the back end of this app entirely through ChatGPT.
Be sure to check out the app here for more context!
TL;DR? ChatGPT thrives on specific and thorough prompting. If you are using it as a co-pilot for your project, give it enough context and it’ll do your job for you.
Part 1: Laying the Groundwork for Wayfinder
The inception of WashU Wayfinder wasn’t exactly linear. The previous semester, our team learned the ins and outs of developing in virtual reality, specifically on the Oculus Quest. That experience taught us the basics of Unity in the AR/VR space, which follows a different workflow than a traditional Unity game: you have to build to an external device like a Quest, HoloLens, or iPhone just to test any changes you’ve made. The lesson we took away is that in AR/VR it pays to get your code right the first time, because every debugging cycle means waiting for a fresh build to push to the device (ChatGPT made this possible, as you’ll see later on). Starting what would be my last semester at WashU, we shifted our focus from VR to AR.
We quickly settled on the idea of building an app to give back to the WashU community, starting from the concept of seeing the campus in AR. That raised the challenge of finding a 3D model of the WashU campus we could use in Unity. Instead of diving headfirst into modeling one from scratch in Blender, we used a tool called RenderDoc, which let us pull what is essentially a 3D screenshot out of Google Maps. From RenderDoc to Blender to Unity, plus learning to wrangle Git LFS (for storing large files on GitHub), we had our model ready.
Now equipped with that context, we arrive at ChatGPT’s first major contribution to the app.
We knew we wanted a helpful description for each building on campus to place on our model, so we turned to ChatGPT for help. First, we listed every building on campus in a Google Sheets column, “Building_Name”, assigned each building a division in a second column, “Division”, and added any important quick facts about the building we thought were necessary in a third column, “Description”. Next, we used the “GPT For Sheets” add-on to generate a two-sentence description for every building on campus at once, using the following prompt:
“I am going to give you a building on the Washington University in St. Louis campus. The format I give it in will be (Building Name, Division, Description). Division will be one of the following categories: Library, Engineering, Business, Arts/Sciences, Design/Visual Arts, Dorm, Law, Special. Description will be a little bit of info on what is in that building. I want you to provide a two sentence description of that building based on the information provided. If the division is a dorm, you should include whether it has modern or traditional style dorms in your output, this info will be provided in the description. Your output should include the name and the division of the school along with the description. If the description states a certain department housed in the building, make sure to include it in your output. Feel free to add more to the two sentences than simply what is provided in the description you are given. The first sentence should start with ([Building] is a [Division] building) but the rest is up to you. The input is:”
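In GPT For Sheets, running this over every row comes down to a single formula. Here’s a minimal sketch, assuming the prompt text above is stored in cell F1 and the building data starts in row 2; the exact GPT() signature may vary by add-on version:

```
=GPT($F$1, "(" & A2 & ", " & B2 & ", " & C2 & ")")
```

Drag the formula down the column and each row gets its own generated description.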
The following is an example output:
With some prompt engineering and a few tweaks to the output, we saved ourselves a lot of typing and research across more than 100 buildings. Now we had our 3D campus model and a CSV of our data; it was time to let ChatGPT go to work building the app.
Part 2: Iterating, Building, and Leveraging ChatGPT
Now for the framework. The process I used to build out the back end of the app was:
1. Prompting: Describe the vision.
2. Iterating: Have ChatGPT work through your roadmap.
3. Expanding: Build off the foundation of your previous sprint, introducing new ideas and features.
Next, we’ll dive into examples of how I applied this, starting with describing the vision.
1. Describe your whole vision
To start, I wrote a page-long prompt for GPT-4 in a Google Doc, describing in depth what we had so far and our vision for it. I’ve found that when developing with ChatGPT as a co-pilot, it’s best to work in roadmap-driven sprints: give GPT-4 the entire context of your idea, including down-the-line features you might add, so it can build you an adequate base in the short term in the form of a roadmap of steps to accomplish. As I mentioned, my first prompt was a page long and described the full context:
“I am making an AR app within Unity using AR Foundation. In the app, you can place a prefab model of our college campus onto a plane. Within that prefab, I want to add a gameobject at the 3d position of every building on campus, about 120 total. Within the app UI/canvas, I want to add a search bar that will let you type in a building name and buildings will be suggested as you type. For example, if the building was Danforth Hall, I would want the user to be able to type in "dan" and see the options for Danforth Hall and Danforth Center pop up below the search bar. From there, you would be able to scroll and find the building you want. Once you click on the building you want, I want a red arrow to pop up over the building you clicked. Clicking that building in the search bar or list function would then prompt one of the 120 gameobjects I placed. I imagine those gameobjects would all be "inactive" in the inspector, so they would exist by default within the campus model prefab but none of them would display until called as I’ve described. Within each 120 gameobjects would be two child objects, a red arrow and a text box that displays the building name over the building. Additionally, I want a small text box at the bottom of the UI to display a description of that building. For the building names and descriptions, I have a CSV in which the first column includes every building name. The second column has 2-3 sentence descriptions for each building so as to not take up too much room in the UI. I want you to help me roadmap how I would implement these ideas. Give me a comprehensive roadmap of implementing what I have described.”
The key in this step is to be specific. The more ChatGPT knows, the better it is at writing efficient, scalable code that can be expanded upon later. From that prompt, ChatGPT returned a step-by-step roadmap, starting with parsing the CSV and working up through the search bar, the building markers, and the description UI.
2. Prompt ChatGPT on each line item in your roadmap, like a for-each loop
Now that GPT and I were on the same page with our roadmap, I told ChatGPT: “write the script for part 1 for me”, and the output was a CSVParser script, which I immediately tested in Unity to make sure it could read in data in the structure we wanted. Unity threw one error, which I plugged back into ChatGPT, and it rebuilt the script to work perfectly on the second try. Anyone familiar with coding knows how annoying getting code working can be: it may take one try, but it could also take dozens of tries and hours of your time. This process took just minutes, with none of that debugging headache. It even built in debug warnings without my asking!
The script it gave me handled all of the CSV reading for us.
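For a sense of what that script involved, here’s a minimal sketch of a Unity CSV parser along the same lines. The class name CSVParser comes from the article, while the Resources-based loading, field names, and dictionary layout are my assumptions:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Minimal sketch in the spirit of the generated script.
// Assumes a two-column "Building_Name,Description" CSV placed in Assets/Resources.
public class CSVParser : MonoBehaviour
{
    [SerializeField] private string csvFileName = "buildings"; // buildings.csv in Resources

    // Maps each building name to its two-sentence description.
    public Dictionary<string, string> BuildingDescriptions { get; private set; }
        = new Dictionary<string, string>();

    private void Awake()
    {
        TextAsset csv = Resources.Load<TextAsset>(csvFileName);
        if (csv == null)
        {
            Debug.LogWarning($"CSVParser: could not find {csvFileName} in Resources.");
            return;
        }

        string[] lines = csv.text.Split('\n');
        for (int i = 1; i < lines.Length; i++) // skip the header row
        {
            string line = lines[i].Trim();
            if (string.IsNullOrEmpty(line)) continue;

            // Assumes simple rows with no embedded commas in building names.
            int comma = line.IndexOf(',');
            if (comma < 0)
            {
                Debug.LogWarning($"CSVParser: skipping malformed row {i}: {line}");
                continue;
            }

            string name = line.Substring(0, comma).Trim();
            string description = line.Substring(comma + 1).Trim().Trim('"');
            BuildingDescriptions[name] = description;
        }

        Debug.Log($"CSVParser: loaded {BuildingDescriptions.Count} buildings.");
    }
}
```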
All I had to do was hit copy, paste it into Unity, and we successfully read our CSV as needed. Could I have coded this without ChatGPT? Yes. Definitely. But it would have taken some Google searches and a lot of unnecessary time spent figuring out the best way to code it. Instead, I could plug and play a working solution in minutes.
Once it worked, I moved on to the next step in the roadmap, and ChatGPT was armed with the context of what the CSVParser script could do. With that context, I first had it generate a functioning search bar script that populated itself from the CSVParser dictionary. Once the search bar script worked, I had it build a script that determined which building marker on the model to show, based on whatever the user clicked in the search bar. In short, ChatGPT built on the building blocks of its previous scripts to create a working system:
Read in the data → show the list of building options in the search bar → display the UI in the 3D AR space for what the building selected in the search bar.
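For illustration, the search-and-select logic might have looked something like the sketch below. The class and field names here (BuildingSearch, descriptionBox, and so on) are my assumptions, but it builds on the CSVParser dictionary the same way the article describes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.UI;

// Illustrative sketch, not the article's verbatim code.
public class BuildingSearch : MonoBehaviour
{
    [SerializeField] private CSVParser csvParser;      // from the previous step
    [SerializeField] private InputField searchInput;   // the search bar UI
    [SerializeField] private Text descriptionBox;      // description text at the bottom of the UI
    [SerializeField] private Transform campusModel;    // prefab holding the ~120 inactive markers

    private GameObject activeMarker;

    private void Start()
    {
        searchInput.onValueChanged.AddListener(UpdateSuggestions);
    }

    // Filter building names as the user types, e.g. "dan" -> Danforth Hall, Danforth Center.
    private void UpdateSuggestions(string query)
    {
        if (string.IsNullOrWhiteSpace(query)) return;

        List<string> matches = csvParser.BuildingDescriptions.Keys
            .Where(name => name.IndexOf(query, StringComparison.OrdinalIgnoreCase) >= 0)
            .OrderBy(name => name)
            .ToList();

        // The real app fed these into a scrollable list of buttons under the search bar.
        Debug.Log($"Suggestions: {string.Join(", ", matches)}");
    }

    // Called when the user taps a suggestion: activate that building's marker
    // (the red arrow and label children) and show its description.
    public void SelectBuilding(string buildingName)
    {
        if (activeMarker != null) activeMarker.SetActive(false);

        Transform marker = campusModel.Find(buildingName); // finds inactive children too
        if (marker == null)
        {
            Debug.LogWarning($"No marker found for {buildingName}");
            return;
        }

        activeMarker = marker.gameObject;
        activeMarker.SetActive(true);
        descriptionBox.text = csvParser.BuildingDescriptions[buildingName];
    }
}
```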
This is the power of context and building one block at a time. ChatGPT knew what the other scripts could do and built each new script with that in mind, pulling what it needed to from the other scripts.
Unfortunately, this process wasn’t perfect at every turn, as I sometimes had to remind GPT what the other scripts contained and exactly which feature we wanted to add. Nonetheless, it was far, far faster than the traditional loop of trying code, getting an error, Googling the issue, writing print statements, and eventually reaching that eureka moment. In our case, we had to build to our iPhones for each test, which would have made that loop even longer without the co-pilot.
3. Move on to the next sprint with your foundation intact
After finishing this roadmap, the core functionality of the app was working. Along the way, however, we came up with ideas that weren’t in our first brainstorm. For example, I wanted a feature where, when a new building was selected, an arrow would spawn at the previous building and travel to the new one, so the user could follow it instead of hunting for the building in the AR view.

By this point the app was nearly fully built and our core scripts were packed with working code, so it wasn’t as simple as just asking for the new feature: with so many prompts behind us, ChatGPT could no longer remember the exact, up-to-date version of each script. I’ve found that when your project gets this complicated, it’s best to point GPT in the exact direction you need it to look. In my next prompt, I pasted in the scripts I knew the change would affect, along with the feature request. Once GPT saw the up-to-date scripts, it knew what needed to be done; it’s great at problem solving with what is directly in front of it. The fix was as simple as restating the context you’re referring to, instead of relying on the model to find it in a massive thread of prompts.
In that reminder, I gave it a few of the hundred-line C# scripts I knew the change might affect, and ChatGPT found the exact spots in the code to update, completing the final feature.
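The animated arrow itself is a good example of how small these final features were. Here’s a minimal sketch of that behavior, assuming a simple coroutine-based glide; the class name ArrowGuide and its fields are hypothetical:

```csharp
using System.Collections;
using UnityEngine;

// Illustrative sketch of the moving-arrow feature, not the article's verbatim code.
public class ArrowGuide : MonoBehaviour
{
    [SerializeField] private Transform arrow;       // the red arrow object
    [SerializeField] private float travelTime = 1.5f;

    private Transform lastBuilding;

    // Called whenever a new building is selected in the search bar.
    public void MoveToBuilding(Transform building)
    {
        StopAllCoroutines();
        if (lastBuilding == null)
        {
            arrow.position = building.position; // first selection: just appear
        }
        else
        {
            StartCoroutine(Glide(lastBuilding.position, building.position));
        }
        lastBuilding = building;
    }

    private IEnumerator Glide(Vector3 from, Vector3 to)
    {
        float t = 0f;
        while (t < 1f)
        {
            t += Time.deltaTime / travelTime;
            // Ease the arrow across so the user's eye can follow it to the next building.
            arrow.position = Vector3.Lerp(from, to, Mathf.SmoothStep(0f, 1f, t));
            yield return null;
        }
    }
}
```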
Limitations and Summary
Though this framework had its complications, like needing to give ChatGPT reminders and pointers to avoid ‘hallucinations’, it proved extremely helpful in developing our app’s back end. On the front end, GPT provided useful instructions on how to build UI in Unity, but we leaned on it much less for design and user experience.
Finally, try to think of other aspects of your life you could automate with ChatGPT. Speak your grocery list using ChatGPT’s Whisper-powered voice input in the mobile app and get a succinct list without typing out each line item. Describe an Excel problem and get a working formula in seconds rather than stressing over every IF statement and its parentheses. Plan a detailed travel itinerary down to the minute, or just get a rough trip outline you can expand on. And while GPT-3.5 handles all of this well, consider unlocking GPT-4 and seeing what it can do.
If you’ve made it this far, hit follow to be the first to hear about my latest projects!