
Installing and Configuring the TensorFlow GPU Version on Ubuntu 16.04

Date: 2017/2/28 13:46:38   Editor: Linux教程

Requirements

  • Ubuntu 16.04
  • Python 2.7
  • Flask
  • TensorFlow, GPU version

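Installing the NVIDIA driver

After repeatedly running into problems during installation, I finally found a reliable method through Google. First, check the model of your NVIDIA graphics card: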

sudo lshw -numeric -C display


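This shows your graphics card information; mine, for example, is product: GM107M [GeForce GTX 950M] [10DE:139A]. Then go to the NVIDIA driver search page and look up the driver version your card needs.

Below is the driver version that matches my machine: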

LINUX X64 (AMD64/EM64T) DISPLAY DRIVER

Version:    375.20
Release Date:   2016.11.18
Operating System:   Linux 64-bit
Language:   English (US)
File Size:  72.37 MB

The search results indicate that my driver version should be 375.20. To double-check, you can also list the drivers available to you with this command:

ubuntu-drivers devices

The result agrees with the search: the recommended driver is also 375.

== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
vendor   : NVIDIA Corporation
model    : GM107M [GeForce GTX 950M]
modalias : pci:v000010DEd0000139Asv000017AAsd0000380Bbc03sc02i00
driver   : nvidia-367 - third-party free
driver   : nvidia-375 - third-party free recommended
driver   : nvidia-364 - third-party free
driver   : nvidia-358 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin
driver   : nvidia-370 - third-party free

== cpu-microcode.py ==
driver   : intel-microcode - distro non-free
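
If you would rather not read the recommendation off by eye, the listing can be scanned programmatically. The following is only a minimal sketch: it assumes the "driver : <package> - <tags>" line layout shown above, and the function name recommended_driver is hypothetical.

```python
import re

def recommended_driver(devices_output):
    """Return the package on the 'driver' line tagged 'recommended', or None."""
    for line in devices_output.splitlines():
        # Expected layout (an assumption): "driver   : <package> - <tags...>"
        match = re.match(r"\s*driver\s*:\s*(\S+)\s+-\s+(.*)$", line)
        if match and "recommended" in match.group(2).split():
            return match.group(1)
    return None

# A few lines copied from the listing above.
sample = """\
driver   : nvidia-367 - third-party free
driver   : nvidia-375 - third-party free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin
"""
print(recommended_driver(sample))
```

In practice you would feed it the captured output of ubuntu-drivers devices rather than a pasted sample.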

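Now we can finally install the matching driver with the following command:

# version: 375
sudo apt-get install nvidia-375
# For your own version xxx:
# sudo apt-get install nvidia-xxx

Is the installation slow, or can the package not be found? Switch to a different software source. You can google how to do this; the simplest way is to open System -> Settings -> Software & Updates in the GUI and pick another mirror, such as Aliyun or USTC (I suddenly could no longer connect to the USTC mirror, which was a real pain). Then run the following commands: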

sudo apt-get install mesa-common-dev
sudo apt-get install freeglut3-dev

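After installation, reboot the machine and the driver should be in place. Search for nvidia in the Dash; if something like NVIDIA X Server Settings appears, the driver was installed successfully. The next step is installing CUDA 8.

Installing CUDA 8

First download CUDA Toolkit 8.0; you can register an account for this yourself.

Be sure to choose the runfile. Once the download finishes, run: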

sudo sh cuda_8.0.44_linux.run --override

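This starts the installer. It begins with the End User License Agreement, which you can skip with Ctrl+C and then accept. The interactive prompts follow. At the first one, Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?, answer n, because you have already installed the driver.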

Using more to view the EULA.
End User License Agreement
--------------------------


Preface
-------

The following contains specific license terms and conditions
for four separate NVIDIA products. By accepting this
agreement, you agree to comply with all the terms and
conditions applicable to the specific product(s) included
herein.


NVIDIA CUDA Toolkit


Description

The NVIDIA CUDA Toolkit provides command-line and graphical
tools for building, debugging and optimizing the performance
of applications accelerated by NVIDIA GPUs, runtime and math
libraries, and documentation including programming guides,
user manuals, and API references. The NVIDIA CUDA Toolkit
License Agreement is available in Chapter 1.


Default Install Location of CUDA Toolkit

Windows platform:

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]:  

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y 

Enter CUDA Samples Location
 [ default is /home/kinny ]: 

Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Missing recommended library: libXmu.so

Installing the CUDA Samples in /home/kinny ...
Copying samples to /home/kinny/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-8.0
Samples:  Installed in /home/kinny, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-8.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_17494.log

Configuring the CUDA environment variables

export PATH="$PATH:/usr/local/cuda-8.0/bin"
export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64"

nvidia-smi

If nvidia-smi prints its status table (driver version and the detected GPU), the configuration succeeded.
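
The installer summary asks that PATH and LD_LIBRARY_PATH include the CUDA directories. A small check like the one below can confirm that the exports took effect in the current shell. It is a sketch only: the helper name missing_cuda_dirs is hypothetical, and the default cuda_root assumes the default install location shown in the installer output.

```python
import os

def missing_cuda_dirs(env, cuda_root="/usr/local/cuda-8.0"):
    """Return (variable, directory) pairs that the installer summary
    asks for but that are absent from the given environment mapping."""
    wanted = [("PATH", cuda_root + "/bin"),
              ("LD_LIBRARY_PATH", cuda_root + "/lib64")]
    missing = []
    for var, directory in wanted:
        # Split on ':' and ignore empty entries and trailing slashes.
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if directory not in entries:
            missing.append((var, directory))
    return missing

for var, directory in missing_cuda_dirs(os.environ):
    print("%s does not include %s" % (var, directory))
```

The script prints nothing when both directories are present. Because export only affects the current shell, the lines would typically also go into a startup file such as ~/.bashrc to survive a new terminal.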

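Installing the deep learning library cuDNN

First download cuDNN 5.1. A direct download is very slow, so I had to go through a proxy and downloaded it from the terminal. Note that this requires that you have already registered as a developer!

proxychains wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v5.1/prod/8.0/cudnn-8.0-linux-x64-v5.1-tgz
This request is rejected as forbidden because it is unauthenticated; only an authenticated developer can download. Start the download in Chrome first, then open "Show all" and copy the real download URL from there. Quote the URL, because it contains an & that the shell would otherwise interpret:
proxychains wget "http://developer.download.nvidia.com/compute/machine-learning/cudnn/secure/v5.1/prod/8.0/cudnn-8.0-linux-x64-v5.1.tgz?autho=1479703345_7fbb517b03361780b45a2c43277bb9ac&file=cudnn-8.0-linux-x64-v5.1.tgz"
This time it worked, and the speed was acceptable. The downloaded file ends up with a mangled name, though; rename it to cudnn-8.0-linux-x64-v5.1.tgz.

Then extract it:
tar xvzf cudnn-8.0-linux-x64-v5.1.tgz
Then copy the libraries and the header file into the CUDA directory (it must be the directory you actually installed to, such as /usr/local/cuda-8.0). With a correct installation there is normally a symbolic link /usr/local/cuda -> /usr/local/cuda-8.0/: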
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
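
To confirm which cuDNN version actually landed in the CUDA directory, the copied header can be inspected. The sketch below assumes that cudnn.h defines the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL and that the header sits under /usr/local/cuda/include; both function names are hypothetical.

```python
import re

def cudnn_version(header_text):
    """Extract (major, minor, patch) from the CUDNN_* version macros,
    or return None if any of them is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)

def installed_cudnn_version(path="/usr/local/cuda/include/cudnn.h"):
    """Read the header at `path` (assumed location); None if unreadable."""
    try:
        with open(path) as header:
            return cudnn_version(header.read())
    except IOError:
        return None

print(installed_cudnn_version())
```

A result of None means that the header was not found at that path or does not carry the expected macros.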

Install the GPU-enabled TensorFlow build for Python 2.7; see the official site for details.

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
sudo pip install --upgrade $TF_BINARY_URL

Verification
$ python
Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
>>> quit()
All done!

Errors

1. libcudart.so.8.0: cannot open shared object file: No such file or directory

kinny@kinny-Lenovo-XiaoXin:~/Study/tensorflow-0.11.0rc0/tensorflow/models/image/mnist$ python convolutional.py 
Traceback (most recent call last):
  File "convolutional.py", line 34, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

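The fix is to set the environment variables: change the CUDA variables set earlier to the following, which are the ones required on the official TensorFlow site.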

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
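
Before rerunning the MNIST script, you can check whether the dynamic loader now finds the CUDA libraries without importing TensorFlow at all. This is a rough sketch using Python's standard ctypes module; the helper name can_load is hypothetical, and the library name libcudnn.so.5 is an assumption about the soname that cuDNN 5.1 installs.

```python
import ctypes

def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False

# libcudart.so.8.0 is the file named in the ImportError above;
# libcudnn.so.5 is an assumed name for the cuDNN 5.1 library.
for name in ("libcudart.so.8.0", "libcudnn.so.5"):
    print("%-18s %s" % (name, "ok" if can_load(name) else "NOT FOUND"))
```

LD_LIBRARY_PATH is read when the process starts, so run the exports in the shell first and then launch Python from that same shell.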

2. TypeError: run() got an unexpected keyword argument 'argv'

Traceback (most recent call last):
  File "convolutional.py", line 339, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
TypeError: run() got an unexpected keyword argument 'argv'

The fix is to remove the argv argument from the tf.app.run() call shown in the traceback.
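
If you prefer not to edit every example by hand, one possible alternative is a small wrapper that retries without argv when the keyword is rejected. This is only a sketch, not something the TensorFlow documentation prescribes: run_compat and old_style_run are hypothetical names, and old_style_run merely stands in for a run() function that does not accept argv.

```python
def run_compat(run, main, argv):
    """Call run(main=..., argv=...); if the argv keyword is rejected,
    retry with main only."""
    try:
        return run(main=main, argv=argv)
    except TypeError as error:
        if "argv" not in str(error):
            raise  # an unrelated TypeError: do not hide it
        return run(main=main)

def old_style_run(main=None):
    # Hypothetical stand-in for a run() that takes no argv keyword.
    return main(["prog"])

def main(args):
    return "called with %r" % (args,)

print(run_compat(old_style_run, main, ["prog", "--extra"]))
```

The fallback silently drops the extra arguments, so it only suits scripts that do not rely on them. It also keys off the error message text, which makes it fragile compared with simply removing the argument.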

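Using a Python virtual environment

Running the MNIST example with the GPU version was very slow; it essentially hung while downloading and reading the data! To compare GPU and CPU performance, I installed the CPU version of TensorFlow in a virtual environment: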

sudo apt-get install python-pip python-dev python-virtualenv

mkdir -p ~/py2virtualenv
virtualenv --system-site-packages ~/py2virtualenv/tensorflowcpu
source ~/py2virtualenv/tensorflowcpu/bin/activate
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL

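It turned out that the CPU version downloads and reads the data quickly! The CPU is well suited to I/O, simple logic, and scalar arithmetic, whereas the GPU is not suited to I/O-heavy work or simple arithmetic but is extremely strong at matrix operations. After downloading the MNIST dataset to local disk, I ran tensorflow/tensorflow/models/image/mnist/convolutional.py with both the CPU version and the GPU version. The results:

// CPU version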
Step 8100 (epoch 9.43), 130.6 ms
Minibatch loss: 1.630, learning rate: 0.006302
Minibatch error: 0.0%
Validation error: 0.8%
Average reported every 100 steps: about 130.6 ms

real  19m5.685s
user  67m33.720s
sys 0m12.340s

// GPU version
Step 8100 (epoch 9.43), 23.2 ms
Minibatch loss: 1.634, learning rate: 0.006302
Minibatch error: 0.0%
Validation error: 0.9%
Average reported every 100 steps: about 23.2 ms

real  3m28.296s
user  2m45.888s
sys 0m29.064s

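For dense matrix computation the GPU crushes the CPU: comparing the reported step times (130.6 ms vs. 23.2 ms) it is roughly 5.6 times faster, and about 5.5 times faster in wall-clock time. Mine is a GTX 950M; I don't know how a current GTX 1080M would fare.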
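
The speedup can be recomputed directly from the numbers above. The sketch below takes the step times and the "real" values from the two runs; the helper name time_to_seconds is hypothetical, and it only handles the XmY.YYYs format shown in the timing output.

```python
import re

def time_to_seconds(text):
    """Convert a shell `time` value such as '19m5.685s' to seconds."""
    match = re.match(r"^(\d+)m(\d+(?:\.\d+)?)s$", text.strip())
    if match is None:
        raise ValueError("unrecognized time value: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))

# Values copied from the CPU and GPU runs above.
cpu_step_ms, gpu_step_ms = 130.6, 23.2
cpu_real = time_to_seconds("19m5.685s")
gpu_real = time_to_seconds("3m28.296s")

print("step-time speedup:  %.2fx" % (cpu_step_ms / gpu_step_ms))
print("wall-clock speedup: %.2fx" % (cpu_real / gpu_real))
```

The wall-clock figure includes start-up and evaluation time, so it is expected to sit slightly below the ratio of the step times.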
