Prepare Data

Download the Dataset and Upload to Amazon S3

Step 1: Download the Dataset

  1. Navigate to the MeetAssistData
  2. Download the data to your local machine
  3. Unzip the file, which will expand into a folder called DATA

IMPORTANT - Do NOT open CSV files with Excel:

  • Microsoft Excel can automatically convert data formats, which may corrupt the data (e.g., phone numbers, dates, and IDs may be altered)
  • Recommended approach: upload the data to S3 first (steps below), then use VS Code instead of Excel to view or validate the CSV files
  • If you must view the files locally, use VS Code only; unlike Excel, it won't modify your data
  • Keep the original CSV files unmodified until they have been uploaded to S3
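
If you only need a quick sanity check of the files, a read-only look from the terminal avoids opening them in any editor at all. The sketch below is one possible approach, not part of the original instructions: preview_csvs is a hypothetical helper name, and it assumes the unzipped folder is called DATA and sits in the current directory.

```shell
# Print the line count and the header row of each CSV file in a folder.
# The files are only read, never modified.
preview_csvs() {
  csv_dir="${1:-DATA}"   # folder to inspect; defaults to DATA
  for csv_file in "$csv_dir"/*.csv; do
    [ -e "$csv_file" ] || continue   # skip if the glob matched nothing
    printf '%s: %s lines\n' "$csv_file" "$(wc -l < "$csv_file" | tr -d ' ')"
    head -n 1 "$csv_file"
  done
}
```

Running preview_csvs DATA from the folder that contains DATA prints one summary line and one header line per file.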

Step 2: Create S3 Bucket

Run the following AWS CLI command to create the Amazon S3 bucket. Replace <account-id> with your AWS account ID:

aws s3 mb s3://meetassist-data-<account-id>-ap-northeast-1 --region ap-northeast-1

To find your AWS Account ID, run: aws sts get-caller-identity --query Account --output text
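
The two commands above can also be combined so that the account ID is filled in automatically instead of being pasted by hand. This is a minimal sketch under the assumption that the AWS CLI is installed and has credentials configured; bucket_name and create_bucket are hypothetical helper names introduced here for illustration.

```shell
REGION="ap-northeast-1"

# Build the bucket name used in this guide from an account ID and a region.
bucket_name() {
  printf 'meetassist-data-%s-%s\n' "$1" "$2"
}

# Look up the account ID of the current credentials, then create the bucket.
create_bucket() {
  account_id="$(aws sts get-caller-identity --query Account --output text)" || return 1
  aws s3 mb "s3://$(bucket_name "$account_id" "$REGION")" --region "$REGION"
}
```

Calling create_bucket runs the same aws s3 mb command as above, with <account-id> replaced by the value that aws sts get-caller-identity returns.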

Step 3: Upload Data to S3

  1. Open the Amazon S3 console
  2. Navigate to the bucket you just created: meetassist-data-<account-id>-ap-northeast-1
  3. Create a folder named data
  4. Upload all CSV files from the unzipped DATA folder into the data folder in S3
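
If you prefer the command line to the console, the same upload can likely be done with aws s3 cp. The sketch below is an alternative to the console steps, not the documented procedure: upload_csvs is a hypothetical helper name, and it assumes the unzipped folder is named DATA in the current directory. Uploading to the data/ prefix creates the folder implicitly, so no separate folder step is needed.

```shell
# Copy only the CSV files from a local folder into the data/ prefix of a bucket.
# --exclude "*" drops everything first, then --include "*.csv" adds CSV files back.
upload_csvs() {
  target_bucket="$1"          # e.g. meetassist-data-<account-id>-ap-northeast-1
  source_dir="${2:-DATA}"     # local folder holding the CSV files
  aws s3 cp "$source_dir/" "s3://$target_bucket/data/" \
    --recursive --exclude "*" --include "*.csv"
}
```

For example, upload_csvs meetassist-data-<account-id>-ap-northeast-1 DATA, with <account-id> replaced by your AWS account ID.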

Verification

After uploading, verify that all CSV files are present in your S3 bucket under the data/ prefix:

aws s3 ls s3://meetassist-data-<account-id>-ap-northeast-1/data/

You should see all the CSV files listed in the output.
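
To go one step beyond reading the listing by eye, the number of CSV files in S3 can be compared with the number in the local folder. This is a rough sketch rather than an official check: the function names are hypothetical, it assumes the local folder is DATA, and it only compares counts, not file contents.

```shell
# Count the CSV files directly inside a local folder.
count_local_csvs() {
  find "${1:-DATA}" -maxdepth 1 -type f -name '*.csv' | wc -l | tr -d ' '
}

# Count the lines on stdin that end in .csv (one line per object in `aws s3 ls` output).
count_listed_csvs() {
  awk '/\.csv$/ { n++ } END { print n + 0 }'
}

# Compare the local CSV count with the count under the data/ prefix of the bucket.
verify_upload() {
  check_bucket="$1"
  check_dir="${2:-DATA}"
  local_count="$(count_local_csvs "$check_dir")"
  remote_count="$(aws s3 ls "s3://$check_bucket/data/" | count_listed_csvs)"
  if [ "$local_count" -eq "$remote_count" ]; then
    echo "OK: $remote_count CSV files in S3"
  else
    echo "MISMATCH: $local_count local, $remote_count in S3"
    return 1
  fi
}
```

A matching count suggests that every file was uploaded; a mismatch means at least one file is missing or extra under data/.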