diff options
Diffstat (limited to 'external/clBLAS/CHANGELOG')
-rw-r--r-- | external/clBLAS/CHANGELOG | 245 |
1 files changed, 0 insertions, 245 deletions
diff --git a/external/clBLAS/CHANGELOG b/external/clBLAS/CHANGELOG deleted file mode 100644 index 03b9faff..00000000 --- a/external/clBLAS/CHANGELOG +++ /dev/null @@ -1,245 +0,0 @@ -# ######################################################################## -# Copyright 2013 Advanced Micro Devices, Inc. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ######################################################################## - -clBLAS Readme - -Version: 1.10 -Release Date: April 2013 - -ChangeLog: -____________ -Current Version: -New: - * New Level 1 routines added (an 'x' implies all 4 precisions) - xSWAP, xCOPY, xSCAL, CSSCAL, ZDSCAL, xAXPY, SDOT, DDOT, - CDOTU, ZDOTU, CDOTC, ZDOTC, xROTG, SROTMG, DROTMG, - SROT, DROT, CSROT, ZDROT, SROTM, DROTM, SNRM2, DNRM2, - SCNRM2, DZNRM2, ixAMAX, SASUM, DASUM, SCASUM, DZASUM - * Samples have been added for the new functions - * This release tested using the 9.012 runtime driver and the 2.8 APPSDK -Fixed: - * Failures in *trsm functions with clMAGMA tests -Known Issues: - * Failures & hangs in ztrmm, *trsv, *tpsv functions on Southern Island GPU devices - * Failures in zgemm functions on Northern Island GPU devices - * Failures & hangs are expected to be fixed in the upcoming AMD graphics driver versions. - It is strongly recommended that users keep their graphics driver versions up to date. - -____________ -Version 1.8.291: -Fixed: - * Failures in the following functions: ssyr2, ssyr2k, strsm, strsv, ssyrk, cher, - ctrsv, csymm, cher2, ztrmm on Southern Island GPU devices. - * Failures in the following functions: dsyr, dsyr2, dgemv, dsyrk, - dsyr2k, zsyr2k on Trinity platforms. -Known Issues: - * Failures in *trsm functions with clMAGMA tests - -____________ -Version 1.8.269 (Beta, clMAGMA support): -New: - * No new routines - * This release tested using the 8.961 runtime driver and the 2.6 APPSDK - -Known Issues: - * The clBLASTune executable has been observed to hang on Windows. If - this happens, abort execution of the tune program; it is not required - for correct operation of the BLAS routines (as of 8.872). - * clBLAS can return invalid results on CPU devices (as - of 8.961). The CPU device is primarily a test/debug device, and GPU - devices are unaffected. - * clBLAS can return invalid results for double precision functions (dsyr, - dsyr2, dgemv, dsyrk, dsyr2k, zsyr2k) on Trinity platforms (as of - 8.961). - * clBLAS can return invalid results (ssyr2, ssyr2k, strsm, strsv, ssyrk, cher, - ctrsv, csymm, cher2, ztrmm) on Southern Island GPU devices (as of 8.961). - -____________ -Version 1.7 (Beta, clMAGMA support): -New: - * New Level 3 routines added (an 'x' implies all 4 precisions) - CHER2K, ZHER2K - * New Level 2 routines added (an 'x' implies all 4 precisions) - xTPMV, xTPSV, SSPVM, DSPMV, CHPMV, ZHPMV, SSPR, DSPR, CHPR, ZHPR, - SSPR2, DSPR2, CHPR2, ZHPR2, xGBMV, CHBMV, ZHBMV, SSBMV, DSBMV, - xTBMV, xTBSV - * Samples have been added for the new functions, but are not fully tested - * This release tested using the 8.951 runtime driver and the 2.6 APPSDK - * Note that documentation is incomplete for the new functions - -Known Issues: - * The clBLASTune executable has been observed to hang on Windows. If - this happens, abort execution of the tune program; it is not required - for correct operation of the BLAS routines (as of 8.872). - * clBLAS can return invalid results on CPU devices that support AVX (as - of 8.951). CPU devices that support up to SSE3 are unaffected. The - CPU device is primarily a test/debug device, and GPU devices are - unaffected. - * clBLAS can return invalid results for double precision functions (dsyr, - dsyr2, dgemv, dsyrk, dsyr2k, zsyr2k) on Trinity platforms (as of - 8.951). - * clBLAS can return invalid results (ssyr, ssyr2, strsv, ctrsv, ssyrk, - ssyr2k, ztrmm) on Southern Island GPU devices (as of 8.951). - -____________ -Version 1.6: -New: - * New Level 3 routines added (an 'x' implies all 4 precisions) - CSYRK, ZSYRK, CSYR2K, ZSYR2K, CHEMM, ZHEMM, CHERK, ZHERK, xSYMM - * New Level 2 routines added (an 'x' implies all 4 precisions) - CGEMV, ZGEMV, xTRMV, xTRSV, CHEMV, ZHEMV, SGER, DGER, CGERU, ZGERU, - CGERC, ZGERC, CHER, ZHER, CHER2, ZHER2, SSYR, DSYR, SSYR2, DSYR2 - * For all the original functions prior to 1.6, a new API has been introduced - with an *Ex suffix. These extended API's add new parameters that allow - users to specify an offset to a matrix argument. This allows efficient - sub-matrix indexing within a clBLAS routine without requiring expensive - sub-matrix copy operations. - * Samples have been added for the new functions - * Preview: Support for AMD Radeon™ HD7000 series GPUs - * This release tested using the 8.92 runtime driver and the 2.6 APP SDK - -Known Issues: - * The clBLASTune executable has been observed to hang on Windows. If this - happens, abort execution of the tune program; it is not required for - correct operation of the BLAS routines (as of 8.872). - * The CPU device for clBLAS is not functioning for this release (as of - 8.872). The CPU device is primarily a test/debug device, and GPU - devices are unaffected. - -____________ -Version 1.4: -New: - * New Level 3 routines added - SSYRK, DSYRK, SSYR2K, DSYR2K - * New Level 2 routines added - SGEMV, DGEMV, SSYMV, DSYMV - * The image support functions (clblasAddScratchImage, - clblasRemoveScratchImage) have been deprecated. Images are no - longer required for the highest performance. - * InstallShield is now used for APPML libraries. The default install - location has changed from c:\amd\clBLAS to - C:\Program Files (x86)\AMD\clBLAS. It is recommended that previous - versions of clBLAS are uninstalled first. - * Samples have been added for the new functions - * This release tested using the 8.872 runtime driver and the 2.5 APP SDK - -Known Issues: - * The clBLASTune executable has been observed to hang on Windows. If this - happens, abort execution of the tune program; it is not required for - correct operation of the BLAS routines (as of 8.872). - * The CPU device for clBLAS is not functioning for this release (as of - 8.872). The CPU device is primarily a test/debug device, and GPU - devices are unaffected. - - -____________ -Version 1.2: - * The library now supports both 32- and 64-bit Windows and Linux operating - systems. - * xTRSM routines are available in 1.2. - * clBLAS routines return clBLASStatus error codes, instead of native - OpenCL error codes - -Fixed: - * xTRMM routines were not properly handling implicit unit diagonal - elements and implicit off-diagonal zero values specified by the BLAS - parameters SIDE, UPLO and DIAG. - * Possible crash with CPU device on 32-bit systems. - * clblasDgemm routine return an invalid event as its last argument. - * clBLAS routines return clblasStatus error codes, instead of - native OpenCL error codes. - -Known Issues: - * The clBLASTune executable has been observed to hang on Windows. If this - happens, abort execution of the tune program; it is not required for - correct operation of the BLAS routines (as of 8.872). - * The CPU device for clBLAS is not functioning for this release (as of - 8.872). The CPU device is primarily a test/debug device, and GPU - devices are unaffected. - -____________________ -Version 1.0: - * Initial release - -Known Issues: - * Available only on Linux64. - * xTRMM routines were not properly handling implicit unit diagonal elements - and implicit off-diagonal zero values specified by the BLAS parameters - SIDE, UPLO and DIAG - * clblasDgemm returned an invalid event as its last argument - -_____________ -Building the Samples: - -To install the Linux versions of clBLAS, uncompress the initial download, then -execute the install script. - -For example: - - tar -xf clBLAS-${version}-Linux.tar.gz - - This installs three files into the local directory, one being an - executable bash script. - - sudo mkdir /opt/clBLAS-${version} - - This pre-creates the install directory with proper permissions - in /opt if it is to be installed there. (This is the default.) - - ./install-clBLAS-${version}.sh - - This prints an EULA and uncompresses files into the chosen install - directory. - - cd ${installDir}/bin64 - export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${OpenCLLibDir}:${clBLASLibDir} - - Be sure to export library dependencies to resolve all external - linkages to the client program; you can create a bash script to - help automate this procedure. - - ./example_sgemm - - Run a simple client; one example is provided for each supported - main BLAS function family. - -The sample program does not ship with native build files; instead, a CMake -file is shipped, and the user generates a native build file for their system. - -For example: - - cd ${installDir} - - mkdir samplesBin/ - - This creates a sister directory to the samples directory that - houses the native makefiles and the generated files from the - build. - - cd samplesBin/ - ccmake ../samples/ - - ccmake is a curses-based cmake program; it takes a parameter - that specifies the location of the source code to compile. - - Hit 'c' to configure for the platform; ensure that the - dependencies to external libraries are satisfied, including - paths to 'ATI Stream SDK'. - - After dependencies are satisfied, hit 'c' again to finalize - configuration. Then, hit 'g' to generate a makefile and - exit ccmake. - - make help - - Look at the options available for make. - - make - - Build the sample client program. - - ./example_sgemm - - Run a simple client; one example is provided for each supported main - BLAS function family. |