Diffstat (limited to 'external/clBLAS/CHANGELOG')
-rw-r--r-- external/clBLAS/CHANGELOG 245
1 files changed, 0 insertions, 245 deletions
diff --git a/external/clBLAS/CHANGELOG b/external/clBLAS/CHANGELOG
deleted file mode 100644
index 03b9faff..00000000
--- a/external/clBLAS/CHANGELOG
+++ /dev/null
@@ -1,245 +0,0 @@
-# ########################################################################
-# Copyright 2013 Advanced Micro Devices, Inc.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-# ########################################################################
-
-clBLAS Readme
-
-Version: 1.10
-Release Date: April 2013
-
-ChangeLog:
-____________
-Current Version:
-New:
- * New Level 1 routines added (an 'x' implies all 4 precisions)
- xSWAP, xCOPY, xSCAL, CSSCAL, ZDSCAL, xAXPY, SDOT, DDOT,
- CDOTU, ZDOTU, CDOTC, ZDOTC, xROTG, SROTMG, DROTMG,
- SROT, DROT, CSROT, ZDROT, SROTM, DROTM, SNRM2, DNRM2,
- SCNRM2, DZNRM2, ixAMAX, SASUM, DASUM, SCASUM, DZASUM
- * Samples have been added for the new functions
- * This release was tested using the 9.012 runtime driver and the 2.8 APP SDK
-Fixed:
- * Failures in *trsm functions with clMAGMA tests
-Known Issues:
- * Failures & hangs in ztrmm, *trsv, *tpsv functions on Southern Islands GPU devices
- * Failures in zgemm functions on Northern Islands GPU devices
- * Failures & hangs are expected to be fixed in the upcoming AMD graphics driver versions.
- It is strongly recommended that users keep their graphics driver versions up to date.
-
-____________
-Version 1.8.291:
-Fixed:
- * Failures in the following functions: ssyr2, ssyr2k, strsm, strsv, ssyrk, cher,
- ctrsv, csymm, cher2, ztrmm on Southern Islands GPU devices.
- * Failures in the following functions: dsyr, dsyr2, dgemv, dsyrk,
- dsyr2k, zsyr2k on Trinity platforms.
-Known Issues:
- * Failures in *trsm functions with clMAGMA tests
-
-____________
-Version 1.8.269 (Beta, clMAGMA support):
-New:
- * No new routines
- * This release was tested using the 8.961 runtime driver and the 2.6 APP SDK
-
-Known Issues:
- * The clBLASTune executable has been observed to hang on Windows. If
- this happens, abort execution of the tune program; it is not required
- for correct operation of the BLAS routines (as of 8.872).
- * clBLAS can return invalid results on CPU devices (as
- of 8.961). The CPU device is primarily a test/debug device, and GPU
- devices are unaffected.
- * clBLAS can return invalid results for double precision functions (dsyr,
- dsyr2, dgemv, dsyrk, dsyr2k, zsyr2k) on Trinity platforms (as of
- 8.961).
- * clBLAS can return invalid results (ssyr2, ssyr2k, strsm, strsv, ssyrk, cher,
- ctrsv, csymm, cher2, ztrmm) on Southern Islands GPU devices (as of 8.961).
-
-____________
-Version 1.7 (Beta, clMAGMA support):
-New:
- * New Level 3 routines added (an 'x' implies all 4 precisions)
- CHER2K, ZHER2K
- * New Level 2 routines added (an 'x' implies all 4 precisions)
- xTPMV, xTPSV, SSPMV, DSPMV, CHPMV, ZHPMV, SSPR, DSPR, CHPR, ZHPR,
- SSPR2, DSPR2, CHPR2, ZHPR2, xGBMV, CHBMV, ZHBMV, SSBMV, DSBMV,
- xTBMV, xTBSV
- * Samples have been added for the new functions, but are not fully tested
- * This release was tested using the 8.951 runtime driver and the 2.6 APP SDK
- * Note that documentation is incomplete for the new functions
-
-Known Issues:
- * The clBLASTune executable has been observed to hang on Windows. If
- this happens, abort execution of the tune program; it is not required
- for correct operation of the BLAS routines (as of 8.872).
- * clBLAS can return invalid results on CPU devices that support AVX (as
- of 8.951). CPU devices that support up to SSE3 are unaffected. The
- CPU device is primarily a test/debug device, and GPU devices are
- unaffected.
- * clBLAS can return invalid results for double precision functions (dsyr,
- dsyr2, dgemv, dsyrk, dsyr2k, zsyr2k) on Trinity platforms (as of
- 8.951).
- * clBLAS can return invalid results (ssyr, ssyr2, strsv, ctrsv, ssyrk,
- ssyr2k, ztrmm) on Southern Islands GPU devices (as of 8.951).
-
-____________
-Version 1.6:
-New:
- * New Level 3 routines added (an 'x' implies all 4 precisions)
- CSYRK, ZSYRK, CSYR2K, ZSYR2K, CHEMM, ZHEMM, CHERK, ZHERK, xSYMM
- * New Level 2 routines added (an 'x' implies all 4 precisions)
- CGEMV, ZGEMV, xTRMV, xTRSV, CHEMV, ZHEMV, SGER, DGER, CGERU, ZGERU,
- CGERC, ZGERC, CHER, ZHER, CHER2, ZHER2, SSYR, DSYR, SSYR2, DSYR2
- * For all the original functions prior to 1.6, a new API has been introduced
- with an *Ex suffix. These extended APIs add new parameters that allow
- users to specify an offset to a matrix argument. This allows efficient
- sub-matrix indexing within a clBLAS routine without requiring expensive
- sub-matrix copy operations.
- * Samples have been added for the new functions
- * Preview: Support for AMD Radeon™ HD7000 series GPUs
- * This release was tested using the 8.92 runtime driver and the 2.6 APP SDK
-
-Known Issues:
- * The clBLASTune executable has been observed to hang on Windows. If this
- happens, abort execution of the tune program; it is not required for
- correct operation of the BLAS routines (as of 8.872).
- * The CPU device for clBLAS is not functioning for this release (as of
- 8.872). The CPU device is primarily a test/debug device, and GPU
- devices are unaffected.
-
-____________
-Version 1.4:
-New:
- * New Level 3 routines added
- SSYRK, DSYRK, SSYR2K, DSYR2K
- * New Level 2 routines added
- SGEMV, DGEMV, SSYMV, DSYMV
- * The image support functions (clblasAddScratchImage,
- clblasRemoveScratchImage) have been deprecated. Images are no
- longer required for the highest performance.
- * InstallShield is now used for APPML libraries. The default install
- location has changed from c:\amd\clBLAS to
- C:\Program Files (x86)\AMD\clBLAS. It is recommended that previous
- versions of clBLAS are uninstalled first.
- * Samples have been added for the new functions
- * This release was tested using the 8.872 runtime driver and the 2.5 APP SDK
-
-Known Issues:
- * The clBLASTune executable has been observed to hang on Windows. If this
- happens, abort execution of the tune program; it is not required for
- correct operation of the BLAS routines (as of 8.872).
- * The CPU device for clBLAS is not functioning for this release (as of
- 8.872). The CPU device is primarily a test/debug device, and GPU
- devices are unaffected.
-
-
-____________
-Version 1.2:
- * The library now supports both 32- and 64-bit Windows and Linux operating
- systems.
- * xTRSM routines are available in 1.2.
- * clBLAS routines return clblasStatus error codes, instead of native
- OpenCL error codes.
-
-Fixed:
- * xTRMM routines were not properly handling implicit unit diagonal
- elements and implicit off-diagonal zero values specified by the BLAS
- parameters SIDE, UPLO and DIAG.
- * Possible crash with CPU device on 32-bit systems.
- * The clblasDgemm routine returned an invalid event as its last argument.
- * clBLAS routines return clblasStatus error codes, instead of
- native OpenCL error codes.
-
-Known Issues:
- * The clBLASTune executable has been observed to hang on Windows. If this
- happens, abort execution of the tune program; it is not required for
- correct operation of the BLAS routines (as of 8.872).
- * The CPU device for clBLAS is not functioning for this release (as of
- 8.872). The CPU device is primarily a test/debug device, and GPU
- devices are unaffected.
-
-____________________
-Version 1.0:
- * Initial release
-
-Known Issues:
- * Available only on Linux64.
- * xTRMM routines were not properly handling implicit unit diagonal elements
- and implicit off-diagonal zero values specified by the BLAS parameters
- SIDE, UPLO and DIAG
- * clblasDgemm returned an invalid event as its last argument
-
-_____________
-Building the Samples:
-
-To install the Linux versions of clBLAS, uncompress the initial download, then
-execute the install script.
-
-For example:
-
- tar -xf clBLAS-${version}-Linux.tar.gz
- - This installs three files into the local directory, one being an
- executable bash script.
-
- sudo mkdir /opt/clBLAS-${version}
- - This pre-creates the install directory with proper permissions
- in /opt if it is to be installed there. (This is the default.)
-
- ./install-clBLAS-${version}.sh
- - This prints an EULA and uncompresses files into the chosen install
- directory.
-
- cd ${installDir}/bin64
- export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${OpenCLLibDir}:${clBLASLibDir}
- - Be sure to export the library directories so that all of the client
- program's external linkages resolve; you can create a bash script to
- help automate this procedure.
-
- ./example_sgemm
- - Run a simple client; one example is provided for each supported
- main BLAS function family.
-
-The sample program does not ship with native build files; instead, a CMake
-file is shipped, and the user generates a native build file for their system.
-
-For example:
-
- cd ${installDir}
-
- mkdir samplesBin/
- - This creates a sister directory to the samples directory that
- houses the native makefiles and the generated files from the
- build.
-
- cd samplesBin/
- ccmake ../samples/
- - ccmake is a curses-based cmake program; it takes a parameter
- that specifies the location of the source code to compile.
- - Hit 'c' to configure for the platform; ensure that the
- dependencies to external libraries are satisfied, including
- paths to 'ATI Stream SDK'.
- - After dependencies are satisfied, hit 'c' again to finalize
- configuration. Then, hit 'g' to generate a makefile and
- exit ccmake.
-
- make help
- - Look at the options available for make.
-
- make
- - Build the sample client program.
-
- ./example_sgemm
- - Run a simple client; one example is provided for each supported main
- BLAS function family.
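
The installation notes above suggest a bash script to automate the library-path export. A minimal sketch follows, assuming the OpenCL and clBLAS library directories are passed as arguments. The function name clblas_ld_path is hypothetical and is not shipped with clBLAS. Unlike the plain `$LD_LIBRARY_PATH:...` form, it avoids a leading empty entry when the variable is unset or empty, which the dynamic loader would otherwise read as the current directory.

```shell
#!/usr/bin/env bash
# Sketch only: clblas_ld_path is a hypothetical helper, not part of clBLAS.

# Print a library search path made of the existing LD_LIBRARY_PATH (if any)
# followed by the OpenCL and clBLAS library directories.
# The ${VAR:+...} expansion emits "existing:" only when the variable is
# non-empty, so no stray leading ':' is produced.
clblas_ld_path() {
    local opencl_lib_dir="$1"
    local clblas_lib_dir="$2"
    printf '%s%s:%s\n' "${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}" \
        "$opencl_lib_dir" "$clblas_lib_dir"
}

# Example use (the three variables are the placeholders from the notes above):
#   export LD_LIBRARY_PATH="$(clblas_ld_path "$OpenCLLibDir" "$clBLASLibDir")"
#   cd "$installDir/bin64" && ./example_sgemm
```

Sourcing this file from a login script, then running the two commented lines, reproduces the manual export and sample run in one step.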