How to convert the data frame values and its headers into a list with python?

I am trying to convert one data frame into a list and I want to keep the header names but I am unable to achieve this.

I am downloading the data from a SQL database and then I convert this data into a data frame:

  • Inner select on one to many
  • Populate “Lookup Table” with random values
  • Connecting Android to SQL Server using JDBC
  • SQL query to replace | in address field to front
  • Select N random records in SQL Server without repetition
  • Inserting into SQL Server using MS Access
  • import pypyodbc
    from datetime import datetime
    
    initial_date = datetime(2017,1,1,00,00,00)
    end_date     = datetime(2017,6,1,00,00,00)
    
    sql_connection = pypyodbc.connect(driver="{SQL Server}", server="xxx.xxx.xxx.xxx", uid="you-dont-know-me",
                                           pwd="guess...", Trusted_Connection="No")
    #execute the SP to retrieve data
    retrieve_database_values = "[DEV].[SP].[QA_ExportV2] @start_date='{start_date:%Y-%m-%d}', " \
                         "@end_date='{end_date:%Y-%m-%d'}".format(start_date=initial_date, end_date=end_date)
    
    df = pd.read_sql_query(retrieve_database_values, sql_connection)
    

    Then the way I convert this data frame into lists is with the following code:

    df.values.tolist()
    

    Which gives me the results:

    [[100008115, 'CAS.Santa', 'CAS.Santa-2', 'Yes', 'Transferred', Timestamp('2017-03-11 08:15:00'), ...],
    [100008116, 'Springfield', 'Springfield:H3', 'Yes','Traffic Variation', Timestamp('2017-09-11 00:00:00'), ...],
    [...],[...]]
    

    However, I want to be able to retrieve the data values and the header names of the data frame, something like this:

    [['id', 100008115, 'site','CAS.Santa', 'site name','CAS.Santa-2', 'new','Yes', 'status','Transferred', 'initial date' ,Timestamp('2017-03-11 08:15:00'), ...],
    ['id',100008116, 'site','Springfield', 'site name','Springfield:H3', 'new','Yes', 'status','Traffic Variation', 'initial date' ,Timestamp('2017-09-11 00:00:00'), ...],
    [...],[...]]
    

    or if possible something like this:

    [[('id', 100008115), ('site','CAS.Santa'), ('site name','CAS.Santa-2'), ('new','Yes'), ('status','Transferred'), ('initial date' ,Timestamp('2017-03-11 08:15:00')), (...)],
    [('id',100008116), ('site','Springfield'), ('site name','Springfield:H3'), ('new','Yes'), ('status','Traffic Variation'), ('initial date' ,Timestamp('2017-09-11 00:00:00')), (...)],
    [...],[...]]
    

  • pandas + pyodbc ODBC SQL type -150 is not yet supported
  • Pandas IO SQL and stored procedure with multiple result sets
  • Pandas SQL chunksize
  • Database engine fails to connect to a sql-server instance while trying to insert using to_sql function
  • Error when trying to create new database table in SQL Server 2016 from csv file while using python 3.5 with pandas and sqlalchemy
  • pyodbc.connect() works, but not sqlalchemy.create_engine().connect()
  • One Solution collect form web for “How to convert the data frame values and its headers into a list with python?”

    One of the options of DataFrame.to_dict() should work.

    import pandas as pd
    df = pd.DataFrame({'a':[1, 2, 3], 'b':[2, 3, 3]})
    
    >>> df
       a  b
    0  1  2
    1  2  3
    2  3  3
    >>>
    >>> df.to_dict('records')
    [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 3}]
    
    >>> result = df.to_dict('records')
    >>> for thing in result:
    ...     print(list(thing.items()))
    
    
    [('a', 1), ('b', 2)]
    [('a', 2), ('b', 3)]
    [('a', 3), ('b', 3)]
    >>>
    

    Data frame is just an intermediate step to achieve my desired result.

    Seems like you could get your result directly from the output of the stored procedure. I have no way to test this but looking through the pypyodbc wiki
    I came up with this alternative to the DataFrame…

    Create a cursor from your connection object

    cursor = sql_connection.cursor()
    

    Reformat the stored procedure call and execute it

    sp = "{{CALL {}}}".format(retrieve_database_values)
    cursor.execute(sp)
    

    Then from method three of the the nice Hello World script

    query_results = [dict(zip([column[0] for column in cursor.description], row)) for row in cursor.fetchall()]
    

    query_results should be a list of dicts like result from my DataFrame.to_dict() solution.

    If I am reading that comprehension correctly, zip produces tuples so I think what you want is

    query_results = [list(zip([column[0] for column in cursor.description], row)) for row in cursor.fetchall()]
    #OR
    query_results = []
    for row in cursor.fetchall():
        column_names = [column[0] for column in cursor.description]
        query_results.append(list(zip(column_names, row)))
    

    I imagine that could be refined.

    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.