I don't know if this is the right board for this question, but I'm coding this application on a Linux platform.
I'm basically new to programming, and I'm teaching myself. I have a basic question that's independent of language, and I know there are some smart people here who could help me out.
I'm building an application that will retrieve two large files from a remote FTP server, parse the files, and update a database with the new data from the files. I'm coding this in Python. My question is this: Which of the two solutions would be the way to go and why? (also, if you know of a better way please let me know).
Solution 1 (memory issues?):
1. Establish a connection with the server.
2. Read the contents of the files into two large data structures
3. Close connection with FTP server.
4. Open connection with SQL database.
5. Iterate through data structures and update database records.
6. Close connection to Database
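To make Solution 1 concrete, here's a minimal sketch of what I have in mind. It assumes a comma-separated "key,value" line format, an SQLite table named records, an anonymous FTP login, and the host/path names — all of those are placeholders, not the real details:

```python
# Sketch of Solution 1: read everything into memory first, then update.
import ftplib
import sqlite3

def parse_line(line):
    """Split one 'key,value' line into a (key, value) tuple."""
    key, value = line.strip().split(",", 1)
    return key, value

def fetch_lines(host, path):
    """Download the whole remote file into memory and return its lines."""
    chunks = []
    ftp = ftplib.FTP(host)
    ftp.login()  # anonymous; use ftp.login(user, passwd) otherwise
    ftp.retrbinary("RETR " + path, chunks.append)
    ftp.quit()
    return b"".join(chunks).decode("utf-8").splitlines()

def update_db(db_path, rows):
    """Upsert all parsed rows into the database in one transaction."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
            rows,
        )
    conn.close()

def run(host, paths, db_path):
    # Steps 1-3: connect, read both files into two big lists, disconnect.
    all_rows = []
    for path in paths:
        all_rows.extend(parse_line(l) for l in fetch_lines(host, path))
    # Steps 4-6: open the database, update every record, close.
    update_db(db_path, all_rows)
```

The whole file lives in memory at once here, which is exactly my worry with this approach.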
Solution 2 (device read and write issues?):
1. Establish connection with FTP server
2. Establish connection with SQL server
3. While there are lines left in the files, retrieve and parse the data line by line from the FTP server, updating the corresponding record in the database as each line is read.
4. Close connections to both servers.
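And here's a sketch of Solution 2 using the same assumed "key,value" format and SQLite records table (again placeholders). ftplib's retrlines calls a function once per line, so only one line needs to be in memory at a time:

```python
# Sketch of Solution 2: stream line by line, updating as we go.
import ftplib
import sqlite3

def stream_update(conn, line):
    """Parse one line and update its record immediately."""
    key, value = line.strip().split(",", 1)
    conn.execute(
        "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
        (key, value),
    )

def run(host, paths, db_path):
    # Steps 1-2: open both connections up front.
    ftp = ftplib.FTP(host)
    ftp.login()  # anonymous; use ftp.login(user, passwd) otherwise
    conn = sqlite3.connect(db_path)
    # Step 3: retrlines invokes the callback once per line, so the
    # whole file is never held in memory.
    with conn:  # commits on success, rolls back on error
        for path in paths:
            ftp.retrlines("RETR " + path,
                          lambda line: stream_update(conn, line))
    # Step 4: close both connections.
    conn.close()
    ftp.quit()
```

My worry here is the opposite one: both connections stay open for the whole transfer, and every line costs a database round trip.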
Since I'm coding this in Python, you can guess that speed isn't really an issue (it'll run very early in the morning), but resource consumption is important. I just want to know the most sensible way to handle the problem.
I'm just wondering how you handle large amounts of data, and I don't really understand what bottlenecks would occur with each solution. If anyone has any ideas, please let me know.