Subject Re: complex indexes vs index key fields
From Mervyn Bick <invalid@invalid.invalid>
Date Tue, 28 Jul 2020 14:00:56 +0200
Newsgroups dbase.getting-started
Attachment(s) load_mysql_energydata.prg

On 2020-07-28 11:36, Akshat Kapoor wrote:

> Good Afternoon Mervyn,
> It is 350 records per minute not 35 records

Oops, I need new glasses. :-(

>
> And that even is slow by MYSQL standards.
>

Over 21 minutes for 7577 records is SLOW by any standards. :-)


> The difference could be due to execution 1 query for every row, and this
> indeed slows down speed.

A row at a time only takes 45 seconds for 7577 records in Firebird.
That's not particularly fast but it means that it would take less than a
minute to load a month's records for each site.

> Inserting more than 1 row in the same query will probably be taking the
> same time. If not the same time then an increase of just single digit %.
>
> I did change a code from row by row processing to bulk processing and
> time improvement was considerable. I did not time the old and new ones
> but it was something like 10+ minutes to approx 1 minute.

Firebird does not accept multiple rows in a single INSERT but I've
modified my loading program so that Gaetano can try it.  This program
has NOT been tested as I don't have MySql available.

The program takes less than a second to generate the 8 INSERT statements
requires to deal with the 7577 records.  Executing them should not take
too many seconds.  :-)

The connection string and the path to the .csv file will need to be
edited.  The program assumes that there is an empty file in MySql with
the correct fields.  The primary key needs to be (siteId,eTimestamp).


Mervyn.



clear
s = seconds()
d = new adodatabase()
d.connectionString = "DRIVER=Firebird/InterBase(r) driver;UID=SYSDBA;PWD=masterkey;DBNAME=d:\firebird_databases\energydata.fdb;DIALECT=3"
d.active = true

_app.allowYeildOnMsg = true
f = new file()
f.open('d:\examples\ado\energydata2.csv')
cRead = f.gets(1000) //header
n = 1
do while not f.eof()
   n1 = 1
   cStr = "insert into energydata (siteId,eTimestamp,egen,econs,ac_pv_econs,"
   cStr += "pv_eexp,g_eimport) values "
   do while n1 < 1000 and not f.eof()
     cRead = f.gets(10000)
     cStr += "("+parse_cRead(cRead)+"),"
     _app.executeMessages()
     if n%500 = 0
        ?n
     endif
     n++
     n1++    
   enddo
   cStr = substr(cStr,1,len(cStr)-1) //get rid of last comma
   d.ExecuteSQL(cStr)
enddo
d.ExecuteSQL('commit')
f.close()
d.active = false
? n,' Records added '
?'seconds',(seconds()-s)

function parse_cRead(cRead)
   //Excludes site field from the INSERT. Wraps siteID and eTimestamp in single quotes
   //Adds NULL in empty fields in the .csv file.  
   local cRet
   cRet = "'"+substr(cRead,1,at(',',cRead,1)-1)+"','"
   cTs = substr(cRead,at(',',cRead,2)+1,(at(',',cRead,3)-1)-at(',',cRead,2))  +"',"
   cRet += convert_ts(cTs)
   cRet += substr(cRead,at(',',cRead,3)+1)
   do while ",,"$cRet
      cRet = stuff(cRet,at(',,',cRet),2,",NULL,")
      if substr(cRet,len(cRet),1) = ','
         cRet += 'NULL'
      endif  
   enddo  
   return cRet
  
function convert_ts(cTs)
   //Reformats the timestamp as YYYY-MM-DD HH:MM:SS
   local cRet,cY,cH,cM,cS
   if not 'A'$cTs and not 'P'$cTs
       cRet = cTs
   else
      cY = substr(cTs,1,9)
      cY = stuff(cy,7,0,'-')
      cY = stuff(cy,5,0,'-')
      if 'P'$cTs and val(substr(cTs,10,2)) = 12
        cH = '00'
      elseif 'P'$cTs and val(substr(cTs,10,2)) < 12
        cH = val(substr(cTs,10,2))+12
      else
        cH = substr(cTs,10,2)
      endif        
      cM = substr(cTs,13,2)
      cS = ":00',"  
      cRet = cY+cH+':'+cM+cS
   endif
   return cRet