
Initial Preparation

Site: STRATIO Training & Certification
Course: Práctica Generative AI Data Processing (14.6)
Book: Initial Preparation
Printed by: Guest
Date: Tuesday, 9 June 2026, 1:08 AM

1. Preliminary Steps in Rocket

In this section we create the Rocket project used for all the practical exercises and upload the CSV files to a working folder in HDFS.

  1. Access Rocket and, if you do not already have one, create a new project named "DataProcessing_" followed by your username.
  2. Access the Rocket environment.

  3. Once the project is created, log in to FileBrowser and navigate from your current directory to the global directory.
  4. From there, navigate to the following path:

    /students - create the directory for your practice here (e.g., /students/foldername)

  5. Inside your newly created directory, upload the CSV files found in the provided material: german_credit_data.csv, clients_today.csv, and external_list.csv.

Note: Remember the path where you upload the CSV files, as you will need it in other sections.
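
As a way of keeping track of the upload location, the following minimal Python sketch builds the full path of each CSV file from your folder name. It assumes the base directory /certification/governance/students, which is the prefix used by the table definitions in this book; the function name `csv_paths` and the folder name `my_folder` are hypothetical and only serve as illustration.

```python
from posixpath import join

# Assumption: base directory taken from the paths used in the table
# definitions of this course; adjust it if your environment differs.
BASE_DIR = "/certification/governance/students"

# File names as referenced by the table definitions.
CSV_FILES = ["german_credit_data.csv", "clients_today.csv", "external_list.csv"]


def csv_paths(folder_name: str) -> dict[str, str]:
    """Return a mapping from CSV file name to its full HDFS path."""
    if not folder_name or "/" in folder_name:
        raise ValueError("folder_name must be a single, non-empty directory name")
    return {name: join(BASE_DIR, folder_name, name) for name in CSV_FILES}


if __name__ == "__main__":
    # "my_folder" is a hypothetical folder name used only for illustration.
    for name, path in csv_paths("my_folder").items():
        print(f"{name} -> {path}")
```

Keeping the base directory in a single constant means that only one line changes if the files end up under a different parent directory.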

2. Creating Tables in the Virtualizer

Next, we create the required tables in the catalog (virtualizer), each pointing to one of the files previously uploaded to HDFS.

In the Rocket project created earlier, go to the catalog.

Then create a new query and execute the following three statements, replacing the bracketed placeholders [unique_id] and [your_folder_name] with your own values:


CREATE TABLE german_credit_data_[unique_id] (
  ID INT, Age INT, Sex STRING, Job INT, Housing STRING, Saving_accounts STRING,
  Checking_account STRING, Credit_amount INT, Duration INT, Purpose STRING, Risk STRING
)
USING csv OPTIONS (
  header 'true', inferSchema 'true',
  path '/certification/governance/students/[your_folder_name]/german_credit_data.csv'
)


CREATE TABLE client_credit_requests_today_[unique_id] (
  ID INT, Age INT, Job INT, Credit_amount INT, Duration INT,
  Purpose_car INT, Purpose_domestic_appliances INT, Purpose_education INT,
  Purpose_furniture_equipment INT, Purpose_radio_TV INT, Purpose_repairs INT,
  Purpose_vacation_others INT, Sex_male INT, Housing_own INT, Housing_rent INT,
  Savings_moderate INT, Savings_no_inf INT, Savings_quite_rich INT, Savings_rich INT,
  Risk_bad INT, Check_moderate INT, Check_no_inf INT, Check_rich INT,
  Age_cat_Young INT, Age_cat_Adult INT, Age_cat_Senior INT
)
USING csv OPTIONS (
  header 'true', inferSchema 'true',
  path '/certification/governance/students/[your_folder_name]/clients_today.csv'
)


CREATE TABLE client_external_info_[unique_id] (
  ID INT, LegalCase INT, FraudSuspicion INT,
  PoliceReport INT, ContactAudit INT, UkvCheck INT, AddressFraudCheck INT
)
USING csv OPTIONS (
  header 'true', inferSchema 'true',
  path '/certification/governance/students/[your_folder_name]/external_list.csv'
)
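
The statements above contain the placeholders [unique_id] and [your_folder_name], which must be replaced before execution. If you prefer to generate the final text rather than edit it by hand, a minimal Python sketch of the substitution could look like this. The function name `fill_placeholders` and the example values are hypothetical; the template is the third statement, copied with its placeholders intact.

```python
import re

# Template copied from the third statement, placeholders left as-is.
TEMPLATE = (
    "CREATE TABLE client_external_info_[unique_id] (ID INT, LegalCase INT, "
    "FraudSuspicion INT, PoliceReport INT, ContactAudit INT, UkvCheck INT, "
    "AddressFraudCheck INT) USING csv OPTIONS ( header 'true', inferSchema 'true', "
    "path '/certification/governance/students/[your_folder_name]/external_list.csv' )"
)


def fill_placeholders(template: str, unique_id: str, folder_name: str) -> str:
    """Replace [unique_id] and [your_folder_name] and check nothing is left over."""
    # The unique_id becomes part of a table name, so restrict it to letters,
    # digits and underscores; other characters would need identifier quoting.
    if not re.fullmatch(r"[A-Za-z0-9_]+", unique_id):
        raise ValueError(f"unique_id is not safe in a table name: {unique_id!r}")
    statement = (
        template.replace("[unique_id]", unique_id)
        .replace("[your_folder_name]", folder_name)
    )
    # Any bracketed token still present means a placeholder was missed.
    leftover = re.findall(r"\[[^\]]+\]", statement)
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return statement


if __name__ == "__main__":
    # "jdoe" and "jdoe_practice" are hypothetical values for illustration.
    print(fill_placeholders(TEMPLATE, "jdoe", "jdoe_practice"))
```

The leftover check is there to catch a statement that still contains a bracketed placeholder before it reaches the query editor.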

Now, return to the catalog and verify that your tables have been created.
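
Seeing a table listed in the catalog does not by itself show that its path resolves to the uploaded file, so it can be worth running a couple of sanity queries per table. The Python sketch below only builds the query text; it assumes the catalog's query editor accepts standard Spark SQL such as `DESCRIBE TABLE` and `SELECT COUNT(*)`, and the function name `sanity_queries` and the identifier `jdoe` are hypothetical.

```python
# Base table names from the three CREATE TABLE statements.
BASE_NAMES = [
    "german_credit_data",
    "client_credit_requests_today",
    "client_external_info",
]


def sanity_queries(unique_id: str) -> list[str]:
    """Build a DESCRIBE and a row-count query for each of the three tables."""
    queries = []
    for base in BASE_NAMES:
        table = f"{base}_{unique_id}"
        # Shows the column names and types registered in the catalog.
        queries.append(f"DESCRIBE TABLE {table}")
        # Forces the CSV file to be read, which surfaces a wrong path.
        queries.append(f"SELECT COUNT(*) AS row_count FROM {table}")
    return queries


if __name__ == "__main__":
    # "jdoe" is a hypothetical unique_id used only for illustration.
    for query in sanity_queries("jdoe"):
        print(query)
```

A row count that fails or returns zero would suggest checking the path option against the location where the CSV files were uploaded.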