Thread: cblas_dgemm works on workstation, undefined symbol: dgemm_ on virtual machine

  1. #1
    Registered User
    Join Date
    Apr 2011
    Posts
    34

    NOTE: I have also asked this question on stack overflow here but I'm not optimistic that community has the expertise to answer it.

    I'm trying to invoke a routine in a shared library from within a PHP script. This totally works on my Ubuntu 20.04 workstation running PHP 8.2, but croaks on two different virtual machines, even when I recompile CBLAS on that specific machine:
    * Ubuntu 20.04, php 7.4
    * Ubuntu 22.04, php 8.1

    The error on these two vms is this:
    Code:
    php: symbol lookup error: /home/sneakyimp/cblas/CBLAS/lib/cblas_LINUX.so: undefined symbol: dgemm_

    In broad strokes, I've got a PHP script that uses FFI to load a simplified cblas.h file and this creates a proxy object I can use in PHP to call functions in the shared CBLAS library. Because this works on my workstation, I'm inclined to think the problem arises once the function invocation is passed to the shared lib, cblas_LINUX.so, which is specified in the cblas.h file.

    Can someone help me understand what this error means and the mechanism by which functions in a shared library get invoked in this situation?

    I have been compiling BLAS/CBLAS as a shared library according to these instructions. Basically, we fetch BLAS and make it, note the location of blas_LINUX.a, then download CBLAS and make the following changes to its makefile:
    Code:
    # path to just-compiled static lib
    # NOTE your path will be different
    BLLIB = /home/sneakyimp/cblas/BLAS-3.11.0/blas_LINUX.a
    CBLIB = ../lib/cblas_$(PLAT).so
    
    CFLAGS = -O3 -DADD_ -fPIC
    
    FFLAGS = -O3 -fPIC
    
    ARCH = gcc
    
    ARCHFLAGS = -shared -o
    Then we make CBLAS. This takes maybe 30 seconds, and yields cblas_LINUX.so

    I then modify this simplified cblas.h file, which I derived from this example blas.h file, so that it defines FFI_LIB to point directly to that cblas_LINUX.so file's full path.
    Code:
    #define FFI_SCOPE "blas"
    #define FFI_LIB "/home/sneakyimp/cblas/CBLAS/lib/cblas_LINUX.so"
    
    typedef  size_t CBLAS_INDEX_t;
    
    
    size_t cblas_idamax(const int N, const double *X, const int incX);
    
    double cblas_dsdot(const int N, const float *X, const int incX, const float *Y,
                       const int incY);
    double cblas_ddot(const int N, const double *X, const int incX,
                      const double *Y, const int incY);
    double cblas_dnrm2(const int N, const double *X, const int incX);
    double cblas_dasum(const int N, const double *X, const int incX);
    
    void cblas_dswap(const int N, double *X, const int incX, 
                     double *Y, const int incY);
    void cblas_dcopy(const int N, const double *X, const int incX, 
                     double *Y, const int incY);
    void cblas_daxpy(const int N, const double alpha, const double *X,
                     const int incX, double *Y, const int incY);
    void cblas_drotg(double *a, double *b, double *c, double *s);
    void cblas_drotmg(double *d1, double *d2, double *b1, const double b2, double *P);
    void cblas_drot(const int N, double *X, const int incX,
                    double *Y, const int incY, const double c, const double  s);
    void cblas_drotm(const int N, double *X, const int incX,
                    double *Y, const int incY, const double *P);
    void cblas_dscal(const int N, const double alpha, double *X, const int incX);
    void cblas_dgemv(const enum CBLAS_ORDER order,
                     const enum CBLAS_TRANSPOSE TransA, const int M, const int N,
                     const double alpha, const double *A, const int lda,
                     const double *X, const int incX, const double beta,
                     double *Y, const int incY);
    void cblas_dgbmv(const enum CBLAS_ORDER order,
                     const enum CBLAS_TRANSPOSE TransA, const int M, const int N,
                     const int KL, const int KU, const double alpha,
                     const double *A, const int lda, const double *X,
                     const int incX, const double beta, double *Y, const int incY);
    void cblas_dtrmv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const double *A, const int lda, 
                     double *X, const int incX);
    void cblas_dtbmv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const int K, const double *A, const int lda, 
                     double *X, const int incX);
    void cblas_dtpmv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const double *Ap, double *X, const int incX);
    void cblas_dtrsv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const double *A, const int lda, double *X,
                     const int incX);
    void cblas_dtbsv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const int K, const double *A, const int lda,
                     double *X, const int incX);
    void cblas_dtpsv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_DIAG Diag,
                     const int N, const double *Ap, double *X, const int incX);
    void cblas_dsymv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const int N, const double alpha, const double *A,
                     const int lda, const double *X, const int incX,
                     const double beta, double *Y, const int incY);
    void cblas_dsbmv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const int N, const int K, const double alpha, const double *A,
                     const int lda, const double *X, const int incX,
                     const double beta, double *Y, const int incY);
    void cblas_dspmv(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                     const int N, const double alpha, const double *Ap,
                     const double *X, const int incX,
                     const double beta, double *Y, const int incY);
    void cblas_dger(const enum CBLAS_ORDER order, const int M, const int N,
                    const double alpha, const double *X, const int incX,
                    const double *Y, const int incY, double *A, const int lda);
    void cblas_dsyr(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                    const int N, const double alpha, const double *X,
                    const int incX, double *A, const int lda);
    void cblas_dspr(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                    const int N, const double alpha, const double *X,
                    const int incX, double *Ap);
    void cblas_dsyr2(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                    const int N, const double alpha, const double *X,
                    const int incX, const double *Y, const int incY, double *A,
                    const int lda);
    void cblas_dspr2(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
                    const int N, const double alpha, const double *X,
                    const int incX, const double *Y, const int incY, double *A);
    void cblas_dgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA,
                     const enum CBLAS_TRANSPOSE TransB, const int M, const int N,
                     const int K, const double alpha, const double *A,
                     const int lda, const double *B, const int ldb,
                     const double beta, double *C, const int ldc);
    void cblas_dsymm(const enum CBLAS_ORDER Order, const enum CBLAS_SIDE Side,
                     const enum CBLAS_UPLO Uplo, const int M, const int N,
                     const double alpha, const double *A, const int lda,
                     const double *B, const int ldb, const double beta,
                     double *C, const int ldc);
    void cblas_dsyrk(const enum CBLAS_ORDER Order, const enum CBLAS_UPLO Uplo,
                     const enum CBLAS_TRANSPOSE Trans, const int N, const int K,
                     const double alpha, const double *A, const int lda,
                     const double beta, double *C, const int ldc);
    void cblas_dsyr2k(const enum CBLAS_ORDER Order, const enum CBLAS_UPLO Uplo,
                      const enum CBLAS_TRANSPOSE Trans, const int N, const int K,
                      const double alpha, const double *A, const int lda,
                      const double *B, const int ldb, const double beta,
                      double *C, const int ldc);
    void cblas_dtrmm(const enum CBLAS_ORDER Order, const enum CBLAS_SIDE Side,
                     const enum CBLAS_UPLO Uplo, const enum CBLAS_TRANSPOSE TransA,
                     const enum CBLAS_DIAG Diag, const int M, const int N,
                     const double alpha, const double *A, const int lda,
                     double *B, const int ldb);
    void cblas_dtrsm(const enum CBLAS_ORDER Order, const enum CBLAS_SIDE Side,
                     const enum CBLAS_UPLO Uplo, const enum CBLAS_TRANSPOSE TransA,
                     const enum CBLAS_DIAG Diag, const int M, const int N,
                     const double alpha, const double *A, const int lda,
                     double *B, const int ldb);
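    NOTE: PHP's FFI parser accepts only a limited subset of C syntax in these .h files; preprocessor directives other than the FFI_SCOPE and FFI_LIB defines are not supported.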

    Lastly, my PHP script is what starts everything off and invokes the function call. Note the call to cblas_dgemm, which invokes the function in the shared lib.
    Code:
    $m1 = [
        [1, 2, 3],
        [4, 5, 6]
    ];
    $m1t = array_map(null, ...$m1);
    
    // lifted from ghostjat/np, converts matrix to obj with FFI\CData
    function get_cdata($m) {
        // this check should be more specific, but works for now
        if (!is_array($m) || !is_array($m[0])) {
            throw new Exception('param must be array of arrays');
        }
    
        $rowcount = sizeof($m);
        $colcount = sizeof($m[0]);
        $flat_m = [];
        $size = $rowcount * $colcount;
        $cdata = \FFI::cast('double *', \FFI::new("double[$size]"));
        $i = 0;
        foreach($m as $row) {
            foreach($row as $val) {
                $flat_m[$i] = $val;
                $cdata[$i] = $val;
                $i++;
            }
        }
    
        return [
            'rows' => $rowcount,
            'cols' => $colcount,
            'flat_m' => $flat_m,
            'cdata' => $cdata
        ];
    }
    
    $m1_c = get_cdata($m1);
    $m1t_c = get_cdata($m1t);
    
    define('CBLAS_ROW_MAJOR', 101);
    define('CBLAS_NO_TRANS', 111);
    
    $mr_size = $m1_c['rows'] * $m1t_c['cols'];
    $mr_rows = $m1_c['rows'];
    $mr_cols = $m1t_c['cols'];
    $mr_cdata = \FFI::cast('double *', \FFI::new("double[$mr_size]"));
    
    // ASSUMPTION: the post omits the load step; FFI::load() reads FFI_SCOPE and FFI_LIB from cblas.h
    $ffi_blas = \FFI::load(__DIR__ . '/cblas.h');

    $start = microtime(true);
    // cblas_dgemm returns void; the product is written into $mr_cdata through the pointer
    $ffi_blas->cblas_dgemm(CBLAS_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS, $m1_c['rows'], $m1t_c['cols'], $m1_c['cols'], 1.0, $m1_c['cdata'], $m1_c['cols'], $m1t_c['cdata'], $m1t_c['cols'], 0.0, $mr_cdata, $mr_cols);
    echo "cblas_dgemm elapsed time: " . (microtime(true) - $start) . "seconds\n";
    How can this work on one machine but not the others? Is the cblas_LINUX.so library relying on some other library installed on my workstation? What changes must I make, presumably to the compilation of cblas_LINUX.so, to make sure this will work on any machine?

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    I think the first thing I would do would be to run the 'nm' tool on each of the cblas_LINUX.so files to see what symbols are present in each.
    Do the VM versions lack the dgemm_ symbol?
    Does your real machine have the dgemm_ symbol?

    How did cblas_dgemm become dgemm_ ?
    Has some pre-processing step run amok?
    Is it only cblas_dgemm that is broken, or are other ones also mangled as well?

    Next I would explore if the different versions of PHP have something to do with the problem.

    Prefix your PHP command line with strace (see strace(1) - Linux manual page):
    strace -e trace=%file -ff -o tracing
    You'll get a lot of tracing.pid text files, but they should contain the fully resolved path of any .so files you're using.
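    As a rough sketch of how one might sift those trace files: the shell helper below (the name opened_libs is made up for illustration) reads strace output on stdin and prints every shared-object path that an open() or openat() call opened successfully. It assumes strace's usual one-call-per-line format, in which failed calls end in " = -1 ...".

```shell
#!/bin/sh
# opened_libs: read strace output on stdin and print each .so path
# (optionally versioned, e.g. libfoo.so.3) that was opened successfully.
opened_libs() {
    grep -E '^open(at)?\(' |
        grep -v ' = -1 ' |
        grep -oE '"[^"]+\.so(\.[0-9]+)*"' |
        tr -d '"' |
        sort -u
}

# Usage, assuming the trace files are named tracing.<pid> as produced
# by the "-ff -o tracing" options:
#   cat tracing.* | opened_libs
```

    Running this on a machine where the script works and on one where it fails, then comparing the two lists, should show whether a BLAS library is being loaded on one and not the other.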

  3. #3
    Registered User
    Join Date
    Apr 2011
    Posts
    34
    Quote Originally Posted by Salem View Post
    I think the first thing I would do would be to run the 'nm' tool on each of the cblas_LINUX.so files to see what symbols are present in each.
    THANK YOU for this advice. I'm more interested in learning the bigger picture here than just solving the specific problem. I have run the nm command on cblas_LINUX.so on my workstation (Ubuntu 20, PHP 8.2) where the code runs, and also on a virtual machine (Ubuntu 20, PHP 7.4) where the code does not run, and grepping the nm output for 'dgemm' shows dgemm_ as U on both machines:
    Code:
    ubuntu 20, php7.4
    0000000000012120 T cblas_dgemm
                     U dgemm_
    ubuntu 20, php 8.2
    0000000000012120 T cblas_dgemm
                     U dgemm_
    There are some other differences between the two nm outputs when I diff them.

    Quote Originally Posted by Salem View Post
    Do the VM versions lack the dgemm_ symbol?
    Does your real machine have the dgemm_ symbol?
    I'm not sure what you mean by 'lack the dgemm_ symbol'. dgemm_ does appear in the nm output on both machines, but the "U" indicates this symbol is 'undefined'. It seems to me that these cblas_LINUX.so files do refer to this symbol, and probably try to find it by loading some other library (lapack.so?).
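    To make the T/U distinction concrete, here is a small shell sketch (the helper name symbol_status is hypothetical) that reads nm output on stdin and reports whether a given symbol is defined in that object or only referenced by it. It is fed the two lines from the listing above. In nm's output, T marks a symbol defined in the text (code) section, and U marks one that is referenced but has to be supplied by some other object at load time.

```shell
#!/bin/sh
# symbol_status SYMBOL: read `nm` output on stdin and report whether
# SYMBOL is defined there (e.g. type T), undefined (type U), or absent.
symbol_status() {
    awk -v sym="$1" '
        $NF == sym {
            type = $(NF - 1)
            print (type == "U" ? "undefined" : "defined")
            found = 1
            exit
        }
        END { if (!found) print "absent" }
    '
}

# The two lines quoted from the nm output in this thread.
sample='0000000000012120 T cblas_dgemm
                 U dgemm_'

printf '%s\n' "$sample" | symbol_status cblas_dgemm   # prints "defined"
printf '%s\n' "$sample" | symbol_status dgemm_        # prints "undefined"
```

    Against a real build this would be run as nm -D /path/to/cblas_LINUX.so | symbol_status dgemm_ (the path is a placeholder). A result of "undefined" means the dynamic loader has to find dgemm_ in some other library in the process.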

    NOTE: I have noticed that if I install the libblas-dev package on any of these machines:
    Code:
    sudo apt install libblas-dev
    then I can change the FFI_LIB in my cblas.h file to just 'libblas.so':
    Code:
    #define FFI_LIB "libblas.so"
    and the PHP script will run.

    Quote Originally Posted by Salem View Post
    How did cblas_dgemm become dgemm_ ?
    I haven't the foggiest notion. I am simply compiling BLAS and CBLAS as described above. I downloaded the source for both from Netlib.

    Quote Originally Posted by Salem View Post
    Has some pre-processing step run amok?
    Is it only cblas_dgemm that is broken, or are other ones also mangled as well?
    I'm not sure what pre-processing you refer to, and if cblas_dgemm is broken, I would wager that others are, also.

    Quote Originally Posted by Salem View Post
    Next I would explore if the different versions of PHP have something to do with the problem.
    As I've mentioned here, I don't think the PHP version is causing the problem. Installing the libblas-dev package gets things working even with different PHP versions. I believe, but do not know for sure, that PHP in all cases successfully hands off a function call (would this be a system call?) to the cblas_LINUX.so file, but that .so file is built to perform the requested cblas_dgemm operation using some function that it does not itself provide, which must be found somewhere else.
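    One way to test that hunch, sketched in shell: a shared object lists the libraries it asks the loader to bring in as NEEDED entries in its dynamic section, which readelf -d prints. The helper below (needed_libs is an illustrative name) pulls the library names out of that output. If no BLAS library shows up for cblas_LINUX.so, then the file itself gives the loader no hint about where dgemm_ lives, and the lookup can only succeed if a library defining it is already visible in the process.

```shell
#!/bin/sh
# needed_libs: read `readelf -d` output on stdin and print the name of
# every library recorded as a NEEDED dependency.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Usage (the path is a placeholder):
#   readelf -d /path/to/cblas_LINUX.so | needed_libs
```

    Comparing this list for the self-built cblas_LINUX.so with the list for the packaged libblas.so may help explain why one loads cleanly and the other does not.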


    Quote Originally Posted by Salem View Post
    Prefix your PHP command line with strace (see strace(1) - Linux manual page):
    strace -e trace=%file -ff -o tracing
    You'll get a lot of tracing.pid text files, but they should contain the fully resolved path of any .so files you're using.
    Thank you again for this suggestion. I have done this on the virtual machine ubuntu 20/php 7.4 as you suggest. It fails with the usual error:

    Code:
    $ strace -e trace=%file -ff -o tracing php bar.php
    transpose elapsed time: 0.14313387870789seconds
    php: symbol lookup error: /home/jason/cblas/CBLAS/lib/cblas_LINUX.so: undefined symbol: dgemm_
    But I see the directory now contains a file, tracing.1266875, which loads a lot of SO files, but ends with the following:
    Code:
    lstat("/home/jason/cblas/bar.php", {st_mode=S_IFREG|0664, st_size=2097, ...}) = 0
    lstat("/home/jason/cblas", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
    lstat("/home/jason", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
    lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
    getcwd("/home/jason/cblas", 4096)       = 18
    lstat("/home/jason/cblas/training_data_sets.json", {st_mode=S_IFREG|0664, st_size=9017437, ...}) = 0
    openat(AT_FDCWD, "/home/jason/cblas/training_data_sets.json", O_RDONLY) = 3
    stat("/home/jason/cblas/cblas.h", {st_mode=S_IFREG|0664, st_size=7410, ...}) = 0
    openat(AT_FDCWD, "/home/jason/cblas/cblas.h", O_RDONLY) = 3
    openat(AT_FDCWD, "/home/jason/cblas/CBLAS/lib/cblas_LINUX.so", O_RDONLY|O_CLOEXEC) = 3
    +++ exited with 127 +++
    I hope that I can learn more about the system behavior in this case. I think my PHP code behaves the same on all machines and ends up constructing some call specific to Linux, probably nearly identical on both machines. My confusion lies in the nature of this attempted handoff to cblas_LINUX.so. My PHP process probably puts data in a memory location and passes pointers to the functions in the .so file. Any clarification or additional tools to sleuth out the problem here would be much appreciated.
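    One additional tool, offered as a sketch: glibc's dynamic loader honours the LD_DEBUG environment variable, and LD_DEBUG=symbols,bindings makes it log to stderr each symbol lookup and the library a symbol ends up bound to. The wrapper below (trace_symbol is a made-up name) runs a command that way and keeps only the lines that mention one symbol. This assumes a glibc-based system such as Ubuntu.

```shell
#!/bin/sh
# trace_symbol SYMBOL COMMAND [ARGS...]: run COMMAND with loader debug
# logging enabled and show only the output lines that mention SYMBOL.
trace_symbol() {
    sym=$1
    shift
    LD_DEBUG=symbols,bindings "$@" 2>&1 | grep -F -- "$sym"
}

# Usage: run this on the workstation and on a VM, then compare.
#   trace_symbol dgemm_ php bar.php
```

    On the machine where the script works, the output should name the file that dgemm_ was bound to; on a failing machine it should show the places the loader searched before giving up.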
