Collecting Beta Testers and Cleaning TestFlight CSVs
Originally published on my website
My friend Andrew Zheng recently started accepting beta testers for his new app via Google Forms. Forms comes with Google Sheets support built-in, so it’s easy to get set up. However, a couple problems quickly surfaced.
First up, Andrew’s form was set up with two fields: name and email.
Name | Email
Andrew Zheng | `firstname.lastname@example.org`
Joe Smith | `email@example.com`
TestFlight requires a CSV with three columns: first name, last name, and email.
First Name | Last Name | Email
Andrew | Zheng | `firstname.lastname@example.org`
Joe | Smith | `email@example.com`
The data needs to be converted first.
Parsing the Responses
To convert the two-column data to the three-column CSV, we first created a new sheet. We placed headers at the top for our own reference, First Name, Last Name, and Email, then created a formula to extract each part from the first sheet. The first sheet, named “Testers”, contained all the responses from the form. To split the name, we started by using the
SEARCH function, the Google Sheet/Excel equivalent of
index(where:), to check if there was a space in the name that we could use to split:
SEARCH(" ", Testers!$B2). If there was, the formula used the
LEFT function to create a substring up to that space:
LEFT(Testers!$B2, SEARCH(" ", Testers!$B2)). If not, it returned the original content of the cell. The full formula for the first name column is as follows.
=IF(ISNUMBER(SEARCH(" ", Testers!$B2)), LEFT(Testers!$B2, SEARCH(" ", Testers!$B2)), Testers!$B2)
The last name column has a similar formula, with two differences. The first is that the
RIGHT function is used instead of
RIGHT returns a substring starting from the end of the string. Because the
RIGHT function is used, the
SEARCH call inside it has to be subtracted from the total length of the string:
LEN(Testers!$B2) - SEARCH(" ", Testers!$B2). The full formula for the last name column is below.
=IF(ISNUMBER(SEARCH(" ", Testers!$B2)), RIGHT(Testers!$B2, LEN(Testers!$B2) - SEARCH(" ", Testers!$B2)), Testers!$B2)
The email formula is very simple, just a reference to the cell in the responses sheet:
=Testers!$C2. The three formulas are copied to all the rows in the sheet.
Cleaning the Responses
The next step was to upload it to TestFlight, but TestFlight threw an error with the most helpful message ever: “An error has occurred. Try again later.” Apple, being its usual helpful self, opted to not provide any useful information that could help fix the errors with the CSV. After much trial and error, my friend was able to successfully import his CSV. But the amount of work it took to hunt down errors in a CSV led me to create TestFlight Cleaner. TestFlight Cleaner cleans your tester CSVs for you, and optionally shows why it’s removing some entries.
readAsText function. Since my website uses Next.js (which uses React), I used a
useState hook to store the
File object that the file input would provide, then I accessed that in a function. The function converts the file to text using the aforementioned function, then begins to process the CSV.
The first step is to turn the raw data into a two-dimensional array. The CSV text was split using the newline character
\n, then each row was split using a comma as a delimiter. This output was stored in another
The next step is triggered via a
useEffect hook monitoring the
useState hook, which is set by the
processCsv function in the previous step. This step is the error-checking step. The first check is a column count. If it detects more or less than three columns, the step fails, and the user is prompted to fix the CSV. The other error is a malformed email check, which checks if an email contains an
@ sign and matches a comprehensive regular expression. If this check fails, it lets the user know and continues, because it can bypass these rows. Finally, another
useState hook is used to inform another
useEffect hook that this step is complete.
The third step is to clean the CSV using a reducer. There are two settings the user can apply that affect this function: specifying the first row as the header row, and leaving malformed/duplicated rows in the preview.
The first part of this step is to check if the email has appeared in any preceding rows. If it has and the leave duplicated rows setting is active, it outputs a “duplicate” flag, and adds it to the array. If it has and the setting is inactive, it returns the existing array.
Checking for Malformed Emails
The next part is to check for malformed emails, achieved by performing a similar check as step two. If the leave malformed rows setting is active, it outputs a “malformed” flag, then adds it to the array. If neither of these parts are triggered, the reducer does the regular invalid character cleaning.
Cleaning Up Invalid Characters
The cleaning process strips all characters that aren’t alphanumeric, periods, dashes, spaces or non-English characters from the first and last names. The email gets line break characters stripped. Finally, the function sets a
useState hook with the cleaned data, and another
useState hook with any duplicated emails. The duplicated emails hook is set only if the leave duplicated rows setting is active.
The final step is to make the preview. It loops over the two-dimensional cleaned CSV data to create a table, and if a malformed or duplicate flag is set, it places a colour on the email. Red signifies malformed and yellow signifies duplicated.
After a user requests an export using the button below the preview, a
Blob is created by iterating over the cleaned rows to form it back into a CSV. An anchor element is created, with its URL set to the output of
URL.createObjectURL. A download attribute is also set, which sets the filename to "TestFlight Testers - Cleaned.csv". This element is then added to the DOM, clicked, then removed. The object URL is also revoked.
I hope that this tool comes in handy for you. If it does or you have any feedback, please contact me on Twitter or Mastodon. Thank you for reading!