Results 11 to 20 of 28

Thread: 24.04 considerably slower than 20.04

  1. #11
    Join Date
    Aug 2009
    Beans
    1,327
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    Generally, developers don't read this forum. If you want developers to see your post, a better place is https://discourse.ubuntu.com/

    But your tests would be more interesting if they used Ubuntu kernels. I don't know if the Ubuntu 24.04 LTS kernel would boot on 20.04 LTS; I don't have any need myself to be trying alternative kernels.

  2. #12
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    Originally posted by jbicha:
    Generally, developers don't read this forum. If you want developers to see your post, a better place is https://discourse.ubuntu.com/

    But your tests would be more interesting if they used Ubuntu kernels. I don't know if the Ubuntu 24.04 LTS kernel would boot on 20.04 LTS; I don't have any need myself to be trying alternative kernels.
    Hi Jeremy,

    Thanks for chiming in on this thread.

    I am aware that developers don't read this forum. For now, I am just looking for input/suggestions from my friends herein. I'll look towards escalation at some point, if I am able to bound the issue a little better. That being said, this one is driving me a bit nuts.
    Any follow-up information on your issue would be appreciated. Please have the courtesy to report back.

  3. #13
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    Status update:

    I compared, line by line, the dmesg outputs from booting the same kernel, 6.7.0-060700-generic, on both 20.04 and 24.04 and did not see anything (I could easily have missed something).

    I compared the loaded module lists, and do see 4 differences: cfg80211; dmi_sysfs; qrtr; and uas, which has 2 references whereas it used to have 1.
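    A minimal sketch of one way such a module list comparison could be done, assuming the output of lsmod was saved to a text file on each install. The function names and file names here are hypothetical and are not taken from the post. It compares module names together with their "Used by" counts, so a changed reference count also shows up.

```shell
#!/bin/sh
# Sketch only: compare two saved lsmod outputs (hypothetical files), e.g.
#   lsmod > lsmod-20.04.txt     (and likewise on the other install)

module_names() {
    # Skip the "Module Size Used by" header; keep name and use count.
    awk 'NR > 1 { print $1, $3 }' "$1" | LC_ALL=C sort -u
}

module_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    module_names "$1" > "$a"
    module_names "$2" > "$b"
    # comm -3 hides lines common to both inputs.
    # Unindented lines are only in the first file,
    # tab-indented lines are only in the second file.
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

    Usage would be something like "module_diff lsmod-20.04.txt lsmod-24.04.txt". It only reports what differs; it says nothing about whether a difference matters.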

    I tried to look for different default configurations, such as the scheduler. I haven't found anything, but I am not sure of everywhere to look.

    Since the execution differences appeared to be in system call code, I made a test that just does a bunch of system calls to "times". It did not reproduce the slowdown: 24.04 actually ran a little faster than 20.04, by 1.2%.

    Continuing...

  4. #14
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    @Doug S ---

    Configuring and sampling kernel stats on 22.04.3 with 6.7.060700 now for that "other"... You know I also have Dev 24.04 on that same hardware. If you send me what you are running for tests with this, I can send you that also...

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637

  5. #15
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    Originally posted by MAFoElffen:
    @Doug S ---

    Configuring and sampling kernel stats on 22.04.3 with 6.7.060700 now for that "other"... You know I also have Dev 24.04 on that same hardware. If you send me what you are running for tests with this, I can send you that also...
    Hi Mike,

    I would be very grateful if you would try and test on your computer. I am still picking away at things in an attempt to create a relatively simple test. I have achieved a 25% throughput difference between 20.04 and 24.04, but that was with a fairly complicated test involving 11 token passing ring pairs. My current tests involve a much simpler 1 token passing pair using regular pipes or named pipes. For 1 CPU, there is no performance difference. For 2 CPUs, but on the same core, there is about a 9% performance difference. For 2 CPUs on different cores, there is about a 15% performance difference.
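    For anyone comparing their own numbers, here is a minimal sketch of the percent-difference arithmetic, assuming each run is reduced to a single time figure (elapsed seconds or usecs/loop, lower is better). The function name pct_slower is hypothetical. The post does not say how the CPUs were selected for the 1-CPU, same-core and different-core cases; "taskset -c" from util-linux is one common way to pin a process to chosen CPUs.

```shell
#!/bin/sh
# Sketch only: percent slowdown of a candidate run relative to a baseline run.
# $1 = baseline time, $2 = candidate time (same units, lower is better).
pct_slower() {
    awk -v base="$1" -v cand="$2" 'BEGIN {
        if (base + 0 <= 0) {
            print "baseline must be greater than zero" > "/dev/stderr"
            exit 1
        }
        printf "%.1f\n", (cand - base) / base * 100
    }'
}
```

    For example, "pct_slower 100 115" prints 15.0. Those two inputs are made-up values chosen to show the arithmetic, not measurements from this thread.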

    Give me another day or two to try to make some test that others could run.

  6. #16
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    Previously, I was avoiding running out of CPU capacity. However, if I allow overload, then idle state selection decisions are eliminated as a variable, since no CPU goes idle at all for the duration of the test. Processor package power is about 114 watts for this test, and I had to raise my processor temperature limit before throttling by 5 degrees, from 75 to 80 degrees C.

    EDIT:
    Code:
    Ubuntu 20.04.6 LTS
    doug@s19:~$ dpkg -l | grep firm
    ii  amd64-microcode                               3.20191218.1ubuntu1.2                       amd64        Processor microcode firmware for AMD CPUs
    ii  intel-microcode                               3.20230808.0ubuntu0.20.04.1                 amd64        Processor microcode firmware for Intel CPUs
    ii  ipxe-qemu                                     1.0.0+git-20190109.133f4c4-0ubuntu3.2       all          PXE boot firmware - ROM images for qemu
    ii  ipxe-qemu-256k-compat-efi-roms                1.0.0+git-20150424.a25a16d-0ubuntu4         all          PXE boot firmware - Compat EFI ROM images for qemu
    ii  linux-firmware                                1.187.39                                    all          Firmware for Linux kernel drivers
    ii  ovmf                                          0~20191122.bd85bf54-2ubuntu3.4              all          UEFI firmware for 64-bit x86 virtual machines
    
    Ubuntu Noble Numbat (development branch)
    doug@s19:~$ dpkg -l | grep firm
    ii  amd64-microcode                           3.20231019.1ubuntu1                     amd64        Processor microcode firmware for AMD CPUs
    ii  firmware-sof-signed                       2.2.6-1ubuntu4                          all          Intel SOF firmware - signed
    ii  intel-microcode                           3.20231114.1                            amd64        Processor microcode firmware for Intel CPUs
    ii  ipxe-qemu                                 1.21.1+git-20220113.fbbdc3926-0ubuntu1  all          PXE boot firmware - ROM images for qemu
    ii  ipxe-qemu-256k-compat-efi-roms            1.0.0+git-20150424.a25a16d-0ubuntu4     all          PXE boot firmware - Compat EFI ROM images for qemu
    ii  linux-firmware                            20230919.git3672ccab-0ubuntu2.2         amd64        Firmware for Linux kernel drivers
    ii  ovmf                                      2023.11-4                               all          UEFI firmware for 64-bit x86 virtual machines
    While the intel-microcode shows different versions, I know the processor uCode itself is the same for both boots.
    SOF is Sound Open Firmware, so not relevant.

    EDIT 2: I do have a git clone of the linux-firmware repository and do update some files when the mainline kernel install complains.

    EDIT 3: The /lib/firmware directories. I am not sure how to compare the actual files that matter to my system:
    Code:
    doug@s19:/media/nvme/home/doug/idle/perf/results/20-24-compare/firmware$ ls -l
    total 172
    -rw-rw-r-- 1 doug doug  59567 Jan 21 22:52 20-04.txt
    -rw-rw-r-- 1 doug doug 111304 Jan 21 22:55 24-04.txt
    doug@s19:/media/nvme/home/doug/idle/perf/results/20-24-compare/firmware$ head *.txt
    ==> 20-04.txt <==
    /lib/firmware:
    1a98-INTEL-EDK2-2-tplg.bin
    3com
    a300_pfp.fw
    a300_pm4.fw
    acenic
    adaptec
    advansys
    agere_ap_fw.bin
    agere_sta_fw.bin
    
    ==> 24-04.txt <==
    /lib/firmware:
    1a98-INTEL-EDK2-2-tplg.bin.zst
    3com
    acenic
    adaptec
    advansys
    agere_ap_fw.bin.zst
    agere_sta_fw.bin.zst
    amd
    amdgpu
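    A minimal sketch of one way to compare the two saved listings by name, assuming they are "ls -R" style text files like the ones shown above. It strips the .zst (and .xz) compression suffix so that a compressed file on the newer release matches its uncompressed counterpart on the older one. The function names are hypothetical. It compares file names only, not file contents, and entries with the same name in different subdirectories collapse into one.

```shell
#!/bin/sh
# Sketch only: name-level comparison of two saved firmware listings.

normalize_listing() {
    # Drop blank lines and "directory:" headers, strip compression suffixes.
    sed -e '/^$/d' -e '/:$/d' -e 's/\.zst$//' -e 's/\.xz$//' "$1" |
        LC_ALL=C sort -u
}

firmware_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    normalize_listing "$1" > "$a"
    normalize_listing "$2" > "$b"
    # Unindented lines: only in the first listing.
    # Tab-indented lines: only in the second listing.
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

    With the files above, that would be "firmware_diff 20-04.txt 24-04.txt". To narrow things down to firmware a given driver may request, "modinfo -F firmware" followed by a module name lists the file names that module declares.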
    Last edited by Doug S; January 22nd, 2024 at 12:20 AM.

  7. #17
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    You have mail... (PM's)

    All that is missing is Noble on 6.7.0. It doesn't want to boot at the moment on 'that kernel'. I'll look at that tomorrow.

    I ran the query you asked for, sampling every 2 seconds, for about 2 minutes each.

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637

  8. #18
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    On the 20.04 system, I replaced /usr/lib/firmware with my git clone of the current master. The test on 20.04 actually ran a little faster, on average. Conclusion: The performance difference between 20.04 and 24.04 is not due to firmware.

    EDIT: there are approximately an extra 500 interrupts per second for the 24.04 test versus the 20.04 case.

    EDIT 2: One 100-second trace of interrupts while the test was running:

    Code:
    24.04:
        334  tasklet_entry
        334  tasklet_exit
       1204  irq_handler_entry
       1204  irq_handler_exit
       2525  timer_expire_entry
      31787  softirq_entry
      31787  softirq_exit
      31787  softirq_raise
     302037  local_timer_entry
     302136  hrtimer_expire_entry
    
    20.04:
        250  tasklet_entry
        250  tasklet_exit
        600  irq_handler_entry
        600  irq_handler_exit
       2877  timer_expire_entry
      31262  softirq_exit
      31262  softirq_raise
      31263  softirq_entry
     300778  local_timer_entry
     300840  hrtimer_expire_entry
    The irq_handler differences are mostly i915 interrupts (545), which do not occur on 20.04. However, they only account for 0.21% of CPU 0 usage, i.e. no smoking gun. I have yet to look at other ISR times.
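    A minimal sketch of how per-event counts like the ones above could be produced from a saved trace. The post does not say how the trace was captured or counted, so this assumes ftrace-style text lines, where the event name is the field right after a "seconds.microseconds:" timestamp. The function name count_events and the input file are hypothetical.

```shell
#!/bin/sh
# Sketch only: count event names in an ftrace-style text trace.
count_events() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+:$/) {   # timestamp field, e.g. 1234.567890:
                name = $(i + 1)
                sub(/:$/, "", name)           # drop the trailing colon
                print name
                break
            }
        }
    }' "$1" | LC_ALL=C sort | uniq -c | LC_ALL=C sort -n
}
```

    Lines that do not contain a timestamp field, such as trace headers, are ignored. Running it on a trace from each install and comparing the two outputs side by side gives the same kind of table as above.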
    Last edited by Doug S; January 23rd, 2024 at 02:10 AM.

  9. #19
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    @Mike

    Here is what I am currently doing as the test.
    Note: very hacky stuff, don't judge.
    I have modified things to assume the program and script are in the same directory.

    The ping pong c program:

    Code:
    /******************************************************
    /*
    /* pingpong.c Smythies 2022.10.21
    /*      Using stdin and stdout redirection for this
    /*      program is a problem. The program doesn't start
    /*      execution until there is something in the
    /*      stdin redirected queue, so trying to start
    /*      things via the last flag doesn't work.
    /*      Try treating the incoming and outgoing named
    /*      pipes as files opened herein. This will also
    /*      allow timeout management as a future edit.
    /*
    /* pingpong.c Smythies 2022.10.20
    /*      Use the new "last" flag to also start the
    /*      token passing.
    /*
    /* pingpong.c Smythies 2022.10.19
    /*      If the delay between the last read of the
    /*      first token and the write from the last place
    /*      in the chain of stuff is large enough then the
    /*      first instance of the program might have terminated
    /*      and shutdown the read pipe, resulting in a SIGPIPE
    /*      signal. With no handler it causes the program to
    /*      terminate.
    /*      Add an optional command line parameter to indicate if
    /*      this instance of the program is the last one and
    /*      therefore it should not attempt to pass along the
    /*      last token.
    /*
    /* pingpong.c Smythies 2021.10.26
    /*      Everything works great as long as the number
    /*      of stops in the token passing ring is small
    /*      enough. However, synchronization issues
    /*      develop if the number of stops gets big enough.
    /*      Introduce a synchronizing step, after which
    /*      there should not be any EOF return codes.
    /*
    /* pingpong.c Smythies 2021.10.24
    /*      Print loop number and error code upon error
    /*      exit. Exit on 1st error. Was 3rd.
    /*
    /* pingpong.c Smythies 2021.10.23
    /*      Change to using CLOCK_MONOTONIC_RAW instead of
    /*      gettimeofday, as it doesn't have any
    /*      adjustments.
    /*      Change to nanoseconds.
    /*
    /* pingpong.c Smythies 2021.07.31
    /*      Add write error check.
    /*
    /* pingpong.c Smythies 2021.07.24
    /*      Exit after a few errors.
    /*
    /* pingpong.c Smythies 2021.07.23
    /*      Add execution time.
    /*
    /* pingpong.c Smythies 2020.12.07
    /*      Add an outer loop counter command line option.
    /*      Make it optional, so as not to break my existing
    /*      scripts.
    /*
    /* pingpong.c Smythies 2020.06.21
    /*      The original code is from Alexander.
    /*      (See: https://marc.info/?l=linux-kernel&m=159137588213540&w=2)
    /*      But, it seems to get out of sync in my application.
    /*      Start this history header.
    /*      I can only think of some error return.
    /*      Add some error checking, I guess.
    /*
    /******************************************************/
    
    #include <sys/time.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <time.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <limits.h>
    #include <errno.h>
    #include <string.h>
    //#include <signal.h>
    //#include <sys/wait.h>
    //#include <linux/unistd.h>
    
    #define MAX_ERRORS 2
    /* Arbitrary */
    #define SYNC_LOOPS 3
    
    unsigned long long stamp(void){
       struct timespec tv;
    
       clock_gettime(CLOCK_MONOTONIC_RAW,&tv);
    
       return (unsigned long long)tv.tv_sec * 1000000000 + tv.tv_nsec;
    } /* endprocedure */
    
    int main(int argc, char **argv){
       unsigned long long tend, tstart;
       long i, j, k, n, m;
       long eof_count = 0;
       int error_count = 0;
       int err, inf, outf, errvalue;
       int last = 0;
       char c = '\n';
       char *infile, *outfile;
    
    //   fprintf(stderr, "begin...\n");
    
       switch(argc){
       case 4:
          infile = argv[1];
          outfile = argv[2];
          n = atol(argv[3]);
          m = LONG_MAX;
          break;
       case 5:
          infile = argv[1];
          outfile = argv[2];
          n = atol(argv[3]);
          m = atol(argv[4]);
          break;
       case 6:
          infile = argv[1];
          outfile = argv[2];
          n = atol(argv[3]);
          m = atol(argv[4]);
          last = atoi(argv[5]);
          break;
       default:
          printf("%s : Usage: pingpong infifo outfifo inner_loop [optional outer_loop [optional last flag]]\n", argv[0]);
          return -1;
       } /* endcase */
    
    //   printf(" infile: %s  ; outfile: %s  ; %d\n", infile, outfile, last);
    
       if(last != 1){  // for all but the last, create the named pipe outfile
          err = mkfifo(outfile, 0666);
          if ((err != 0) && (errno != EEXIST)){ // file already exists is OK
             errvalue = errno;
             printf("Cannot create output fifo file: %s ; %d ; %s\n", outfile, err, strerror(errvalue));
             return -1;
          } /* endif */
       } else {   // for the last we open the write first, read should already be open.
          if ((outf = open(outfile, O_WRONLY))  == -1){
             errvalue = errno;
             printf("Cannot open last output fifo file: %s ; %d ; %s\n", outfile, outf, strerror(errvalue));
             return -1;
          } /* endif */
       } /* endif */
    
       if ((inf = open(infile, O_RDONLY)) == -1){
          errvalue = errno;
          printf("Cannot open input fifo file: %s ; %d ; %s\n", infile, inf, strerror(errvalue));
          return -1;
       } /* endif */
    
       if(last != 1){  // for all but the last, now we open the write
    //   if ((outf = open(outfile, O_WRONLY | O_NONBLOCK))  == -1){
          if ((outf = open(outfile, O_WRONLY))  == -1){
             errvalue = errno;
             printf("Cannot open not last output fifo file: %s ; %d ; %s\n", outfile, outf, strerror(errvalue));
             return -1;
          } /* endif */
       } /* endif */
    
       if(last == 1){  // the last chain initiates the token passing
    //      usleep(999999);
          err = write(outf, &c, 1);
          if(err != 1){
             fprintf(stderr, "pingpong write error on startup, aborting. %d  %d  %d\n", last, err, outf);
             return -1;
          } /* endif */
       } /* endif */
    
    //   printf("flag 4: inf: %d  ; outf: %d  ; %d \n", inf, outf, last);
    
    /* make sure we are synchronized. EOF (0 return code) can occur until we are */
    
       j = SYNC_LOOPS;
       while(j > 0) {  // for SYNC_LOOP successful loops do:
          err = read(inf, &c, 1);
          if(err == 1){
             j--;        // don't decrement for EOF.
             for (i = n; i; i--){  // we also attempt to sync in time for later T start
                k = i;
                k++;  // dummy work; the value is never used
             } /* endfor */
             err = write(outf, &c, 1);
             if(err != 1){ // and then pass along the token along to the next pipeline step.
                fprintf(stderr, "pingpong sync step: write error or timeout to named pipe. (error code: %d ; loops left: %ld ; last: %d)\n", err, j, last);
                return -1;
             } /* endif */
          } else {
             if(err < 0){
                fprintf(stderr, "pingpong sync step: read error or timeout from named pipe. (error code: %d ; loops left: %ld ; last: %d)\n", err, j, last);
                return -1;
             } else {
                eof_count++;  // does the loop counter need to be reset??
             } /* endif */
          } /* endif */
       } /* endwhile */
    
    //   printf(" infile: %s  ; outfile: %s  ; last: %d; eof_count %ld\n", infile, outfile, last, eof_count);
    
    /* now we are synchronized, or so I claim. Get on with the real work. EOF is an error now.*/
    
       j = m;
       tstart = stamp(); /* only start the timer once synchronized */
       while(j > 0) {  // for outer_loop times do:
          err = read(inf, &c, 1);
          if(err == 1){
             for (i = n; i; i--){  // for each token, do a packet of work.
                k = i;
                k++;  // dummy work; the value is never used
             } /* endfor */
             err = write(outf, &c, 1);
             if(err != 1){ // and then pass along the token along to the next pipeline step.
                fprintf(stderr, "pingpong write error or timeout to named pipe. (error code: %d ; loops left: %ld ; EOFs: %ld ; last: %d)\n", err, j, eof_count, last);
                error_count++;
                if(error_count >= MAX_ERRORS) return -1;
             } /* endif */
          } else {
             error_count++;
             fprintf(stderr, "pingpong read error or timeout from named pipe. (error code: %d ; loops left: %ld ; EOFs: %ld ; last: %d)\n", err, j, eof_count, last);
             if(error_count >= MAX_ERRORS) return -1;
          } /* endif */
    //      if(j <= 3) fprintf(stderr, "Loop: %ld ; EOFs: %ld\n", j, eof_count);
          j--;
       } /* endwhile */
       tend = stamp();  // the timed portion is done
    
    /* Now we do one token pass to flush. The previous write pipe may have already been terminated, so EOF read response is O.K. */
    
       err = read(inf, &c, 1);
       if(err == 1){
          if(last != 1){  // last in the chain does not pass along the last token
             err = write(outf, &c, 1);
             if(err != 1){ // and then pass along the token along to the next pipeline step.
                fprintf(stderr, "pingpong flush loop: write error or timeout to named pipe. (error code: %d ; EOFs: %ld ; last: %d)\n", err, eof_count, last);
             } /* endif */
          } /* endif */
       } else {
          fprintf(stderr, "pingpong flush loop: read error or timeout from named pipe. (error code: %d ; EOFs: %ld ; last: %d)\n", err, eof_count, last);
       } /* endif */
    
       fprintf(stderr,"%.4f usecs/loop. EOFs: %ld\n",(double)(tend-tstart)/((double) m * 1000.0), eof_count);
       close(outf);
       close(inf);
       return -1;
    //   return 0;
    } /* endprogram */
    The script:

    Code:
    #! /bin/dash
    #
    # ping-pong-many-parallel Smythies 2024.01.23
    #       assume the ping pong program is local.
    #
    # ping-pong-many-parallel Smythies 2022.10.23
    #       update required to reflect changes to program
    #
    # ping-pong-many-parallel Smythies 2022.10.09
    #       Launch parallel ping-pong pairs.
    
    # because I always forget from last time
    killall pingpong
    
    # If it does not already exist, then create the first named pipe.
    
    COUNTER=0
    POINTER1=0
    POINTER2=1
    while [ $COUNTER -lt $3 ];
    do
       if [ -p /dev/shm/pong$POINTER1 ]
       then
         rm /dev/shm/pong$POINTER1
       fi
       mkfifo /dev/shm/pong$POINTER1
    
       POINTER1=$(($POINTER1+1000))
       POINTER2=$(($POINTER2+1000))
       COUNTER=$(($COUNTER+1))
    done
    
    COUNTER=0
    POINTER1=0
    POINTER2=1
    while [ $COUNTER -lt $3 ];
    do
       ./pingpong /dev/shm/pong$POINTER1 /dev/shm/pong$POINTER2 $1 $2 &
       ./pingpong /dev/shm/pong$POINTER2 /dev/shm/pong$POINTER1 $1 $2 1 &
    
       POINTER1=$(($POINTER1+1000))
       POINTER2=$(($POINTER2+1000))
       COUNTER=$(($COUNTER+1))
    done
    Create some directory and put those two files there. Make the script executable and compile the C program (Note: use the older OS for the compile):

    Code:
    doug@s19:~/idle/self-contained-test$ ls -l
    total 16
    -rw-rw-r-- 1 doug doug 8874 Jan 23 08:03 pingpong.c
    -rwxr-xr-x 1 doug doug  980 Jan 23 08:03 ping-pong-many-parallel
    doug@s19:~/idle/self-contained-test$ cc pingpong.c -o pingpong
    doug@s19:~/idle/self-contained-test$ ls -l
    total 36
    -rwxrwxr-x 1 doug doug 17304 Jan 23 08:04 pingpong
    -rw-rw-r-- 1 doug doug  8874 Jan 23 08:03 pingpong.c
    -rwxr-xr-x 1 doug doug   980 Jan 23 08:03 ping-pong-many-parallel
    This uses a lot of energy and creates a lot of heat while running, so be sure your thermal and power limit throttling protections are working properly. That being said, we want this test to run without any throttling, so that throttling does not influence the results. This includes throttling based on the number of active cores, so you might have to limit your max CPU frequency to below the limit for the number of active cores. I normally run with thermal throttling set to 75 degrees, but set it to 80 degrees for this. The system should otherwise be fairly idle for this test.

    I use 3 terminals:
    One for test execution;
    One running "top -d 15", where I can be sure there is no idle time;
    One running "sudo /home/doug/kernel/linux/tools/power/x86/turbostat/turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,RAMWatt,GFXWatt,CorWatt --interval 15", monitoring power, temperature, and CPU frequency, where any throttling will show.
    The 2 monitoring terminals sample at a low frequency to reduce their influence on the test.

    Note that there will be a little bit of idle time as the test finishes, because some pairs finish before others and the load reduces. The test needs to run for at least a few minutes to reduce any influence from startup and wind-down. You might need to adjust the number of pairs to run because you have more CPUs and cores than me. You might need to increase the number of loops because your processors are faster than mine.

    Example test run: I use 20 pairs and 30,000,000 loops and no work per token stop, because we are trying to maximize system time and minimize user time. I also use the performance CPU frequency scaling governor.

    Step 1:
    Code:
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    doug@s19:~/idle/self-contained-test$ grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor:powersave
    /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor:powersave
    doug@s19:~/idle/self-contained-test$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    [sudo] password for doug:
    performance
    doug@s19:~/idle/self-contained-test$ grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor:performance
    /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor:performance
    Step 2: Launch the 2 monitoring tasks in their terminals (not shown, yet) and wait for a couple of reference samples.
    Step 3: Launch the test:
    Code:
    doug@s19:~/idle/self-contained-test$ ./ping-pong-many-parallel 0 30000000 20
    pingpong: no process found   <<<< This is normal
    doug@s19:~/idle/self-contained-test$
    Observe the monitoring terminals: first the top window, for no idle time and mostly system time:
    Code:
    top - 08:42:07 up 16:14,  3 users,  load average: 22.54, 8.42, 3.07
    Tasks: 264 total,  25 running, 239 sleeping,   0 stopped,   0 zombie
    %Cpu0  :  7.7 us, 92.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu1  :  7.2 us, 92.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu2  :  7.1 us, 92.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu3  :  8.1 us, 91.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu4  :  7.8 us, 92.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu5  :  8.5 us, 91.5 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu6  :  7.7 us, 92.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu7  :  7.9 us, 92.1 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu8  :  7.9 us, 92.1 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu9  :  7.8 us, 92.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu10 :  8.0 us, 92.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    %Cpu11 :  8.1 us, 91.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    MiB Mem :  31927.3 total,  27810.3 free,    382.2 used,   3734.8 buff/cache
    MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.  31076.6 avail Mem
    
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
       3622 doug      20   0    2364   1024   1024 S  36.5   0.0   0:35.32 pingpong
       3623 doug      20   0    2364   1024   1024 S  36.4   0.0   0:35.35 pingpong
       3640 doug      20   0    2364   1024   1024 R  34.4   0.0   0:35.40 pingpong
       3641 doug      20   0    2364   1024   1024 S  34.4   0.0   0:35.47 pingpong
       3626 doug      20   0    2364    896    896 R  34.0   0.0   0:35.52 pingpong
       3627 doug      20   0    2364    896    896 R  33.8   0.0   0:35.51 pingpong
       3628 doug      20   0    2364    896    896 R  33.6   0.0   0:33.14 pingpong
       3619 doug      20   0    2364   1024   1024 R  33.6   0.0   0:38.24 pingpong
       3629 doug      20   0    2364   1024   1024 S  33.6   0.0   0:33.18 pingpong
       3618 doug      20   0    2364    896    896 S  33.5   0.0   0:38.10 pingpong
       3614 doug      20   0    2364    896    896 R  33.4   0.0   0:35.99 pingpong
       3615 doug      20   0    2364    896    896 S  33.3   0.0   0:35.93 pingpong
       3653 doug      20   0    2364   1024   1024 R  31.8   0.0   0:33.53 pingpong
       3652 doug      20   0    2364   1024   1024 R  31.6   0.0   0:33.41 pingpong
       3650 doug      20   0    2364   1024   1024 R  31.4   0.0   0:34.81 pingpong
       3651 doug      20   0    2364   1024   1024 R  31.2   0.0   0:34.68 pingpong
       3638 doug      20   0    2364   1024   1024 S  30.6   0.0   0:33.73 pingpong
       3639 doug      20   0    2364   1024   1024 S  30.6   0.0   0:33.71 pingpong
       3644 doug      20   0    2364    896    896 R  30.2   0.0   0:34.66 pingpong
       3645 doug      20   0    2364   1024   1024 R  30.0   0.0   0:34.61 pingpong
       3620 doug      20   0    2364    896    896 R  29.8   0.0   0:38.04 pingpong
       3621 doug      20   0    2364    896    896 R  29.8   0.0   0:38.03 pingpong
       3616 doug      20   0    2364   1024   1024 S  29.0   0.0   0:33.86 pingpong
       3617 doug      20   0    2364   1024   1024 S  29.0   0.0   0:33.79 pingpong
       3637 doug      20   0    2364    896    896 R  28.4   0.0   0:32.42 pingpong
       3636 doug      20   0    2364   1024   1024 R  28.3   0.0   0:32.31 pingpong
       3646 doug      20   0    2364   1024   1024 R  27.4   0.0   0:33.42 pingpong
       3647 doug      20   0    2364    896    896 S  27.4   0.0   0:33.35 pingpong
    ...
    and the turbostat terminal, for no throttling and a consistent CPU frequency. This is from after the test.
    Note: from our PMs you know to execute your turbostat binary from wherever it is, bypassing the Ubuntu dependency wrapper.
    Code:
    doug@s19:~/idle/perf/results/q243$ sudo /home/doug/kernel/linux/tools/power/x86/turbostat/turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,RAMWatt,GFXWatt,CorWatt --interval 15
    [sudo] password for doug:
    Busy%   Bzy_MHz IRQ     PkgTmp  PkgWatt CorWatt GFXWatt RAMWatt
    0.05    4669    982     36      1.59    0.93    0.00    1.33
    0.05    4696    845     36      1.58    0.93    0.00    1.33
    0.06    4628    1069    36      1.59    0.93    0.00    1.33
    0.05    4646    881     36      1.57    0.91    0.00    1.33
    46.68   4799    25113   66      52.49   51.83   0.00    1.34
    99.76   4800    51467   67      110.69  110.04  0.00    1.33
    99.76   4800    53242   69      111.13  110.47  0.00    1.33
    99.76   4800    52871   70      111.50  110.85  0.00    1.33
    99.76   4800    54558   72      112.39  111.73  0.00    1.33
    99.76   4800    52502   73      112.64  111.97  0.00    1.33
    99.76   4800    53247   73      112.84  112.18  0.00    1.33
    99.76   4800    53043   73      113.05  112.39  0.00    1.33
    99.76   4800    53467   74      113.21  112.55  0.00    1.33
    99.76   4800    52729   74      113.31  112.65  0.00    1.33
    99.76   4800    53662   73      113.24  112.59  0.00    1.33
    99.76   4800    52669   74      113.34  112.68  0.00    1.33
    99.76   4800    53368   74      112.99  112.32  0.00    1.33
    99.76   4800    53080   74      113.12  112.47  0.00    1.33
    99.73   4800    51977   74      113.12  112.46  0.00    1.33
    92.03   4800    1164504 67      106.09  105.42  0.00    1.33
    9.38    4799    17895   44      18.32   17.65   0.00    1.33
    0.01    4100    375     43      2.03    1.37    0.00    1.33
    0.05    4661    1047    43      2.23    1.57    0.00    1.33
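    A minimal sketch of how a saved copy of that turbostat output could be checked for frequency drops while the CPUs are busy, assuming the summary format shown above with a header row starting with Busy%. The function name find_throttled and its arguments are hypothetical. It looks columns up by header name, so the column order does not matter.

```shell
#!/bin/sh
# Sketch only: flag busy samples whose Bzy_MHz is below an expected value.
# $1 = saved turbostat output, $2 = expected busy MHz,
# $3 = minimum Busy% for a sample to be considered.
find_throttled() {
    awk -v want="$2" -v busy="$3" '
        $1 == "Busy%" {                    # header row: learn column positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("Bzy_MHz" in col) {
            b = $(col["Busy%"]) + 0
            f = $(col["Bzy_MHz"]) + 0
            if (b >= busy + 0 && f < want + 0) {
                print NR ": " $0           # line number and the offending sample
                found = 1
            }
        }
        END { exit found ? 1 : 0 }         # exit status 1 if anything was flagged
    ' "$1"
}
```

    For the run above, something like "find_throttled turbostat.txt 4800 99" would apply, with turbostat.txt being wherever the output was saved. A low Bzy_MHz is only a hint of throttling; PkgTmp and PkgWatt still need a look.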
    And, eventually, the test results:
    Code:
    doug@s19:~/idle/self-contained-test$ ./ping-pong-many-parallel 0 30000000 20
    pingpong: no process found
    doug@s19:~/idle/self-contained-test$ 6.9971 usecs/loop. EOFs: 0
    6.9971 usecs/loop. EOFs: 0
    7.0961 usecs/loop. EOFs: 0
    7.0961 usecs/loop. EOFs: 0
    7.2167 usecs/loop. EOFs: 0
    7.2167 usecs/loop. EOFs: 0
    7.3631 usecs/loop. EOFs: 0
    7.3631 usecs/loop. EOFs: 0
    7.4195 usecs/loop. EOFs: 0
    7.4195 usecs/loop. EOFs: 0
    7.4453 usecs/loop. EOFs: 0
    7.4453 usecs/loop. EOFs: 0
    7.4599 usecs/loop. EOFs: 0
    7.4599 usecs/loop. EOFs: 0
    7.4695 usecs/loop. EOFs: 0
    7.4695 usecs/loop. EOFs: 0
    7.4712 usecs/loop. EOFs: 0
    7.4712 usecs/loop. EOFs: 0
    7.5009 usecs/loop. EOFs: 0
    7.5009 usecs/loop. EOFs: 0
    7.5324 usecs/loop. EOFs: 0
    7.5324 usecs/loop. EOFs: 0
    7.6344 usecs/loop. EOFs: 0
    7.6344 usecs/loop. EOFs: 0
    7.6577 usecs/loop. EOFs: 0
    7.6577 usecs/loop. EOFs: 0
    7.6735 usecs/loop. EOFs: 0
    7.6735 usecs/loop. EOFs: 0
    7.6763 usecs/loop. EOFs: 0
    7.6763 usecs/loop. EOFs: 0
    7.7355 usecs/loop. EOFs: 0
    7.7355 usecs/loop. EOFs: 0
    7.7581 usecs/loop. EOFs: 0
    7.7581 usecs/loop. EOFs: 0
    7.8000 usecs/loop. EOFs: 0
    7.8000 usecs/loop. EOFs: 0
    7.8477 usecs/loop. EOFs: 0
    7.8477 usecs/loop. EOFs: 0
    7.8972 usecs/loop. EOFs: 0
    7.8972 usecs/loop. EOFs: 0
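    A minimal sketch for reducing a list of results like the one above to a few numbers that can be compared between installs, assuming the lines were saved to a file or are piped in. The function name summarize_usecs is hypothetical. It looks for the "usecs/loop." token anywhere on a line, so a line that starts with a shell prompt is still counted.

```shell
#!/bin/sh
# Sketch only: count, min, mean and max of "N usecs/loop." result lines.
# Reads the files given as arguments, or standard input if none are given.
summarize_usecs() {
    awk '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "usecs/loop.") {
                    v = $(i - 1) + 0       # the number just before the token
                    if (n == 0 || v < min) min = v
                    if (n == 0 || v > max) max = v
                    sum += v
                    n++
                }
            }
        }
        END {
            if (n == 0) { print "no usecs/loop lines found"; exit 1 }
            printf "n=%d min=%.4f mean=%.4f max=%.4f\n", n, min, sum / n, max
        }
    ' "$@"
}
```

    Since the program prints its result on stderr, running the script with "2> results.txt" appended should capture the lines for "summarize_usecs results.txt". Both members of a pair print the same figure, so each pair is counted twice, which does not change the mean.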

  10. #20
    Join Date
    Feb 2011
    Location
    Coquitlam, B.C. Canada
    Beans
    3,524
    Distro
    Ubuntu Development Release

    Re: 24.04 considerably slower than 20.04

    I did a test with 40 ping pong pairs. For completeness, I also tried a copy of the git master firmware as the firmware for 24.04, as a second test. While it makes a difference, I do not know if the difference is more than possible test repeatability variations. IRQ averaged 3042 per second for all tests.
