Below are common interview questions and sample answers related to data import and export in Python.
Questions:
- What are some common file formats used for data import/export in Python? Answer: The most common formats and data sources include CSV, JSON, XML, Excel, and SQL databases. Python libraries like `pandas`, `json`, `xml.etree.ElementTree`, and `sqlite3` make it easier to work with them.
- How can you import data from a CSV file in Python? Answer: You can use Python’s built-in `csv` module or the `pandas` library to import data from a CSV file. With pandas, it’s as simple as `df = pd.read_csv("file.csv")`.
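
For comparison with the pandas one-liner, here is a minimal sketch of the built-in `csv` route. The sample data and column names are hypothetical, and an in-memory buffer stands in for a real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample data; with a real file you would use
# open("file.csv", newline="") instead of io.StringIO.
sample = io.StringIO("name,age\nAda,36\nLinus,54\n")

# DictReader uses the header row as keys and yields one dict per data row.
rows = list(csv.DictReader(sample))

# Every value arrives as a string, so convert types explicitly where needed.
ages = [int(row["age"]) for row in rows]
```

Unlike `pd.read_csv`, the `csv` module does no type inference, which is why the explicit `int()` conversion is needed.
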
- How do you export a DataFrame to an Excel file using pandas? Answer: You can use the `to_excel()` method in pandas. For example, `df.to_excel("output.xlsx")` will save the DataFrame `df` to an Excel file named `output.xlsx`. Writing `.xlsx` files requires an Excel engine such as `openpyxl` to be installed.
- What is the JSON module and how do you import/export JSON data in Python? Answer: The `json` module in Python allows you to parse JSON into Python objects and vice versa. Use `json.load()` to read from a file and `json.dump()` to write to one; `json.loads()` and `json.dumps()` do the same for strings.
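
The `json` module has two pairs of functions: `loads()`/`dumps()` work on strings, while `load()`/`dump()` work on open file objects. A minimal sketch of both follows; the record contents and the temporary file location are made up for illustration.

```python
import json
import os
import tempfile

record = {"name": "Ada", "languages": ["Python", "SQL"], "active": True}

# Strings: dumps() serializes to a str, loads() parses a str.
text = json.dumps(record, indent=2)
from_text = json.loads(text)

# Files: dump() writes to an open file object, load() reads from one.
path = os.path.join(tempfile.mkdtemp(), "record.json")  # hypothetical location
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f)
with open(path, encoding="utf-8") as f:
    from_file = json.load(f)
```
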
- Explain how you would connect to an SQL database in Python. Answer: Python has several libraries, such as `sqlite3`, `psycopg2`, and `sqlalchemy`, for connecting to SQL databases. You usually establish a connection using a connection string (or, for SQLite, a file path) and then use SQL queries to interact with the database.
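
The connect-then-query pattern can be sketched with the built-in `sqlite3` module. The table and column names here are hypothetical, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

# ":memory:" creates a throwaway database; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
try:
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    # "?" placeholders let the driver bind values safely instead of
    # building SQL by string concatenation.
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])
    conn.commit()
    names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]
finally:
    conn.close()
```

Other drivers such as `psycopg2` follow the same broad connect, execute, commit, close shape, though their connection arguments and placeholder syntax differ.
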
- How would you read an XML file in Python? Answer: You can use Python’s built-in `xml.etree.ElementTree` module to parse XML files. This module provides a way to traverse and manipulate the XML structure.
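
A short sketch of parsing and traversing XML with `xml.etree.ElementTree`. The catalog document below is invented for the example and parsed from a string; for a file on disk, `ET.parse(path).getroot()` returns the same kind of root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only for illustration.
xml_text = """<catalog>
  <book id="b1"><title>Dune</title><price>9.99</price></book>
  <book id="b2"><title>Emma</title><price>7.50</price></book>
</catalog>"""

root = ET.fromstring(xml_text)

# findall() returns matching direct children; get() reads an attribute
# and findtext() reads the text of a child element.
books = [
    {
        "id": book.get("id"),
        "title": book.findtext("title"),
        "price": float(book.findtext("price")),
    }
    for book in root.findall("book")
]
```
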
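- How do you handle missing or corrupted data during import? Answer: In pandas, you can use `read_csv()` parameters like `na_values` and `on_bad_lines` to handle missing or corrupted data during the import process. `on_bad_lines` replaces the older `error_bad_lines`, which was deprecated in pandas 1.3 and removed in pandas 2.0.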
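
A minimal sketch of both ideas with `pd.read_csv`, assuming pandas 1.3 or later, where the parameter for malformed rows is named `on_bad_lines`. The sample data and the extra missing-value markers are hypothetical.

```python
import io

import pandas as pd

# Hypothetical data: two missing-value markers and one row with an extra field.
raw = io.StringIO(
    "id,name,score\n"
    "1,Ada,90\n"
    "2,Bob,N/A\n"
    "3,Cy,75,extra\n"
    "4,Dee,missing\n"
)

df = pd.read_csv(
    raw,
    na_values=["N/A", "missing"],  # treat these strings as NaN
    on_bad_lines="skip",           # drop rows with too many fields
)

missing_scores = int(df["score"].isna().sum())
```

Skipping bad rows silently can hide data-quality problems, so `on_bad_lines="warn"` may be a better default while exploring a new file.
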
- What is the role of the `os` module in data import/export? Answer: The `os` module allows you to interact with the operating system. This is useful for tasks like navigating to the directory where the data files are stored, checking if a file exists, or creating new directories to store exported data.
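
The tasks mentioned above might look like this in practice. The directory layout and file name are hypothetical, and a temporary directory stands in for a real project folder.

```python
import os
import tempfile

base_dir = tempfile.mkdtemp()  # stand-in for a real project directory
export_dir = os.path.join(base_dir, "exports", "2024")  # hypothetical layout

# Create the export directory, including parents; no error if it exists.
os.makedirs(export_dir, exist_ok=True)

out_path = os.path.join(export_dir, "report.csv")

# Check for an existing file before writing, e.g. to avoid overwriting it.
existed_before = os.path.exists(out_path)
with open(out_path, "w", encoding="utf-8") as f:
    f.write("id,value\n1,10\n")

# List what ended up in the export directory.
files = os.listdir(export_dir)
```
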
- How would you import data from an API into Python? Answer: One common way is to use the `requests` library to make an HTTP request to the API endpoint, and then parse the returned JSON data into a Python object.
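
The sketch below shows the same request-then-parse flow, but deliberately swaps in the standard library's `urllib.request` in place of `requests` so it runs without third-party packages. With `requests`, the body of `fetch_json` would reduce to roughly `requests.get(url, timeout=5).json()`. The `fetch_json` helper, the `DemoHandler` class, and the `/items` endpoint are all made up for illustration; the local server exists only so the example works without internet access.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny stand-in API that always returns the same JSON payload."""

    def do_GET(self):
        body = json.dumps({"items": [1, 2, 3]}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    data = fetch_json(f"http://127.0.0.1:{server.server_port}/items")
finally:
    server.shutdown()
    server.server_close()
```

Against a real API you would also want to check the HTTP status and handle timeouts or invalid JSON before trusting the result.
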
- How can you schedule regular data imports/exports in Python? Answer: You can use task scheduling libraries like `schedule` or use cron jobs in Unix-like systems to run your Python scripts at specified times for regular data import/export.