How to convert the data frame values and its headers into a list with python?

I am trying to convert one data frame into a list and I want to keep the header names but I am unable to achieve this.

I am downloading the data from a SQL database and then I convert this data into a data frame:

  • how to run a parallel query on sql server 2008?
  • Selecting path (multiple records) where an ID number (fk) in a table references another ID number (pk) in same table
  • Adding Row Numbers To a SELECT Query Result in SQL Server Without use Row_Number() function
  • Why is Entity Framework passing a parameter as DECIMAL(5,0) for a column defined with NUMERIC(19)?
  • Count number of records returned by group by
  • Unable to connect to msSQL database via PHP
  • import pypyodbc
    from datetime import datetime
    initial_date = datetime(2017,1,1,00,00,00)
    end_date     = datetime(2017,6,1,00,00,00)
    sql_connection = pypyodbc.connect(driver="{SQL Server}", server="", uid="you-dont-know-me",
                                           pwd="guess...", Trusted_Connection="No")
    #execute the SP to retrieve data
    retrieve_database_values = "[DEV].[SP].[QA_ExportV2] @start_date='{start_date:%Y-%m-%d}', " \
                         "@end_date='{end_date:%Y-%m-%d'}".format(start_date=initial_date, end_date=end_date)
    df = pd.read_sql_query(retrieve_database_values, sql_connection)

    Then the way I convert this data frame into lists is with the following code:


    Which gives me the results:

    [[100008115, 'CAS.Santa', 'CAS.Santa-2', 'Yes', 'Transferred', Timestamp('2017-03-11 08:15:00'), ...],
    [100008116, 'Springfield', 'Springfield:H3', 'Yes','Traffic Variation', Timestamp('2017-09-11 00:00:00'), ...],

    However, I want to be able to retrieve the data values and the header names of the data frame, something like this:

    [['id', 100008115, 'site','CAS.Santa', 'site name','CAS.Santa-2', 'new','Yes', 'status','Transferred', 'initial date' ,Timestamp('2017-03-11 08:15:00'), ...],
    ['id',100008116, 'site','Springfield', 'site name','Springfield:H3', 'new','Yes', 'status','Traffic Variation', 'initial date' ,Timestamp('2017-09-11 00:00:00'), ...],

    or if possible something like this:

    [[('id', 100008115), ('site','CAS.Santa'), ('site name','CAS.Santa-2'), ('new','Yes'), ('status','Transferred'), ('initial date' ,Timestamp('2017-03-11 08:15:00')), (...)],
    [('id',100008116), ('site','Springfield'), ('site name','Springfield:H3'), ('new','Yes'), ('status','Traffic Variation'), ('initial date' ,Timestamp('2017-09-11 00:00:00')), (...)],

  • Pandas SQL chunksize
  • pymssql to pandas encoding
  • Need help in inserting Pandas Dataframe into Sql-server DB in python
  • Loop through records and update sql table if exists using Pyodbc or pandas
  • pandas + pyodbc ODBC SQL type -150 is not yet supported
  • DateTime import using Pandas/SQLAlchemy
  • One Solution collect form web for “How to convert the data frame values and its headers into a list with python?”

    One of the options of DataFrame.to_dict() should work.

    import pandas as pd
    df = pd.DataFrame({'a':[1, 2, 3], 'b':[2, 3, 3]})
    >>> df
       a  b
    0  1  2
    1  2  3
    2  3  3
    >>> df.to_dict('records')
    [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 3}]
    >>> result = df.to_dict('records')
    >>> for thing in result:
    ...     print(list(thing.items()))
    [('a', 1), ('b', 2)]
    [('a', 2), ('b', 3)]
    [('a', 3), ('b', 3)]

    Data frame is just an intermediate step to achieve my desired result.

    Seems like you could get your result directly from the output of the stored procedure. I have no way to test this but looking through the pypyodbc wiki
    I came up with this alternative to the DataFrame…

    Create a cursor from your connection object

    cursor = sql_connection.cursor()

    Reformat the stored procedure call and execute it

    sp = "{{CALL {}}}".format(retrieve_database_values)

    Then from method three of the the nice Hello World script

    query_results = [dict(zip([column[0] for column in cursor.description], row)) for row in cursor.fetchall()]

    query_results should be a list of dicts like result from my DataFrame.to_dict() solution.

    If I am reading that comprehension correctly, zip produces tuples so I think what you want is

    query_results = [list(zip([column[0] for column in cursor.description], row)) for row in cursor.fetchall()]
    query_results = []
    for row in cursor.fetchall():
        column_names = [column[0] for column in cursor.description]
        query_results.append(list(zip(column_names, row)))

    I imagine that could be refined.

    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.