TensorFlow Research and Practice Notes

Date: 2017/2/28 13:50:58 Editor: Linux教程

I. Comparing three open-source libraries: Caffe, TensorFlow, and MXNet
http://www.linuxidc.com/Linux/2016-07/133222.htm
I chose to learn TensorFlow first.

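II. Deep learning research

TensorFlow applied to image recognition: http://www.linuxidc.com/Linux/2016-07/133227.htm

Deep convolutional neural network models have achieved strong results on hard visual recognition tasks, reaching human-level performance and in some domains exceeding it.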

III. Installing TensorFlow

Environment: Ubuntu 15.10, 64-bit

1. Download the source
sudo apt-get install git

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

The --recurse-submodules flag is required; it fetches the protobuf library that TensorFlow depends on.

Cloning into 'tensorflow'...
remote: Counting objects: 40348, done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 40348 (delta 0), reused 0 (delta 0), pack-reused 40341
Receiving objects: 100% (40348/40348), 35.45 MiB | 404.00 KiB/s, done.
Resolving deltas: 100% (29338/29338), done.
Checking connectivity... done.
Submodule 'google/protobuf' (https://github.com/google/protobuf.git) registered for path 'google/protobuf'
Cloning into 'google/protobuf'...
remote: Counting objects: 32801, done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 32801 (delta 12), reused 0 (delta 0), pack-reused 32767
Receiving objects: 100% (32801/32801), 31.27 MiB | 1.27 MiB/s, done.
Resolving deltas: 100% (22019/22019), done.
Checking connectivity... done.
Submodule path 'google/protobuf': checked out 'fb714b3606bd663b823f6960a73d052f97283b74'
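
If the repository was cloned without --recurse-submodules, the missing protobuf checkout can be detected afterwards. The sketch below relies on the fact that `git submodule status` marks uninitialized submodules with a leading "-". The helper name uninitialized_submodules is hypothetical and is not part of git.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): reads `git submodule status`
# output on stdin and prints the paths of submodules that are not
# initialized. git marks those lines with a leading "-".
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Intended use inside the clone (commented out so the sketch has no side effects):
#   cd tensorflow
#   git submodule status | uninitialized_submodules
#   git submodule update --init --recursive   # fetch anything that is missing
```

An empty result means every submodule is initialized.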

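2. Install Bazel

OpenJDK is the open-source, GPL-licensed implementation of the Java platform. It has been more than six years since Sun officially released it, and from that moment on the Java community set about adapting to the new open-source code base.
OpenJDK grew quickly in 2013 and was named to the 2013 SD Times 100 by the IT magazine SD Times, ranking 9th in the "great influence" category.

Google recently open-sourced Bazel, the build tool it uses internally.
Bazel is a Make-like tool tailored to the way Google develops software internally, and Google now uses it to build most of its internal software. Its highlights:
Multi-language support: Bazel supports Java, Objective-C, and C++ out of the box and can be extended to any other programming language.

High-level build description language: projects are described in a language called BUILD, a concise text format that treats a project as a set of interrelated libraries, binaries, and tests. Tools like Make, by contrast, require you to describe how the compiler is invoked for each file.

Multi-platform support: the same tool and the same BUILD files can build software for different architectures and even different platforms. At Google, Bazel is used both for server applications running in data centers and for mobile apps on phones.

Reproducibility: in BUILD files, every library, test, and binary must declare its dependencies explicitly. When a source file changes, Bazel uses those dependencies to decide which parts must be rebuilt and which tasks can run in parallel. As a result, every build is incremental and the same build always produces the same result.

Scalability: Bazel handles large projects. At Google, a server program with 100,000 lines of code is common, and rebuilding such a project with nothing changed takes only about 200 milliseconds.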
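
To make the BUILD description concrete, here is a minimal sketch that writes a tiny workspace to a temporary directory. The target and file names (greeting, hello) are made up for illustration, and the cc_library and cc_binary rules are assumed to be available as built-in rules, as they were in Bazel releases of that period.

```shell
#!/bin/sh
# Sketch: create a tiny, hypothetical Bazel workspace in a temp directory.
set -e
demo_dir="$(mktemp -d)"

# An empty WORKSPACE file marks the root of the workspace.
: > "$demo_dir/WORKSPACE"

# BUILD declares targets and their dependencies, not compiler invocations.
cat > "$demo_dir/BUILD" <<'EOF'
cc_library(
    name = "greeting",
    srcs = ["greeting.cc"],
    hdrs = ["greeting.h"],
)

cc_binary(
    name = "hello",
    srcs = ["hello.cc"],
    deps = [":greeting"],  # explicit dependency on the library above
)
EOF

cat > "$demo_dir/greeting.h" <<'EOF'
const char* Greeting();
EOF

cat > "$demo_dir/greeting.cc" <<'EOF'
#include "greeting.h"
const char* Greeting() { return "hello from bazel"; }
EOF

cat > "$demo_dir/hello.cc" <<'EOF'
#include <cstdio>
#include "greeting.h"
int main() { std::puts(Greeting()); return 0; }
EOF

echo "workspace created in $demo_dir"
# With Bazel installed:  (cd "$demo_dir" && bazel build //:hello)
```

Because hello lists :greeting in deps, a change to greeting.cc should trigger a rebuild of both targets, while a change to hello.cc should leave the library alone.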

Install the packages Bazel depends on (Bazel itself runs on Java, hence the JDK):
sudo apt-get install openjdk-8-jdk openjdk-8-source

oot.pem
Adding debian:E-Tugra_Certification_Authority.pem
Adding debian:Staat_der_Nederlanden_EV_Root_CA.pem
Adding debian:GlobalSign_ECC_Root_CA_-_R4.pem
Adding debian:Certinomis_-_Autorité_Racine.pem
Adding debian:ssl-cert-snakeoil.pem
Adding debian:COMODO_Certification_Authority.pem
done.
Processing triggers for libc-bin (2.21-0ubuntu4) ...
Processing triggers for ca-certificates (20150426ubuntu1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...

done.
done.
learning@learning-virtual-machine:~$

sudo apt-get install pkg-config zip g++ zlib1g-dev unzip

Processing triggers for mime-support (3.54ubuntu1.1) ...
Setting up libstdc++-4.8-dev:amd64 (4.8.4-2ubuntu1~14.04.1) ...
Setting up g++-4.8 (4.8.4-2ubuntu1~14.04.1) ...
Setting up g++ (4:4.8.2-1ubuntu6) ...
update-alternatives: using /usr/bin/g++ to provide /usr/bin/c++ (c++) in auto mode
Setting up unzip (6.0-9ubuntu1.5) ...
Setting up zlib1g-dev:amd64 (1:1.2.8.dfsg-1ubuntu1) ...
@ubuntu:~$

Download link: https://github.com/bazelbuild/bazel/releases/download/0.2.2b/bazel-0.2.2b-installer-linux-x86_64.sh
@ubuntu:~$ chmod +x bazel-0.2.2b-installer-linux-x86_64.sh
@ubuntu:~$ ./bazel-0.2.2b-installer-linux-x86_64.sh --user

Bazel is now installed!

Make sure you have "/home/learning/bin" in your path. You can also activate bash
completion by adding the following line to your ~/.bashrc:
  source /home/learning/.bazel/bin/bazel-complete.bash

See http://bazel.io/docs/getting-started.html to start a new project!
learning@learning-virtual-machine:~$ source /home/learning/.bazel/bin/bazel-complete.bash
learning@learning-virtual-machine:~$

 export PATH="$PATH:$HOME/bin"

sudo apt-get install python-numpy swig python-dev

blapack.so.3 (liblapack.so.3) in auto mode
Setting up libpython-dev:amd64 (2.7.5-5ubuntu3) ...
Setting up python2.7-dev (2.7.6-8ubuntu0.2) ...
Setting up python-dev (2.7.5-5ubuntu3) ...
Setting up python-numpy (1:1.8.2-0ubuntu0.1) ...
Setting up swig2.0 (2.0.11-1ubuntu2) ...
Setting up swig (2.0.11-1ubuntu2) ...
Processing triggers for libc-bin (2.19-0ubuntu6.5) ...

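3. Build and install the pip package

First attempt, which failed because the wheel file named in the pip command did not exist: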

mkdir /tmp/tensorflow_pkg
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl

learning@learning-virtual-machine:~$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
Requirement '/tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl' looks like a filename, but the file does not exist
Unpacking /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
Cleaning up...
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 304, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1198, in prepare_files
    do_download,
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1365, in unpack_url
    unpack_file_url(link, location, download_dir)
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 640, in unpack_file_url
    unpack_file(from_path, location, content_type, link)
  File "/usr/lib/python2.7/dist-packages/pip/util.py", line 640, in unpack_file
    unzip_file(filename, location, flatten=not filename.endswith(('.pybundle', '.whl')))
  File "/usr/lib/python2.7/dist-packages/pip/util.py", line 508, in unzip_file
    zipfp = open(filename, 'rb')
IOError: [Errno 2] No such file or directory: '/tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl'

Storing debug log for failure in /home/learning/.pip/pip.log
learning@learning-virtual-machine:~$
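
The traceback above comes down to a hard-coded wheel name that does not exist in /tmp/tensorflow_pkg. One way to avoid guessing the version is to look the file up before calling pip. This is only a sketch; find_wheel is a hypothetical helper, not a pip or TensorFlow command.

```shell
#!/bin/sh
# Hypothetical helper: print the first TensorFlow wheel found in a
# directory, or fail with a message if there is none.
find_wheel() {
  pkg_dir="$1"
  for whl in "$pkg_dir"/tensorflow-*.whl; do
    # When nothing matches, the glob stays literal and -f is false.
    if [ -f "$whl" ]; then
      printf '%s\n' "$whl"
      return 0
    fi
  done
  echo "no tensorflow wheel in $pkg_dir; run build_pip_package first" >&2
  return 1
}

# Intended use (commented out so the sketch installs nothing):
#   whl="$(find_wheel /tmp/tensorflow_pkg)" && pip install "$whl"
```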

Build with Bazel, then install with pip
bazel build -c opt tensorflow/tools/pip_package:build_pip_package

learning@learning-virtual-machine:~/tensorflow$ bazel build -c opt tensorflow/tools/pip_package:build_pip_package
Sending SIGTERM to previous Bazel server (pid=17411)... done.
.......................................
INFO: Waiting for response from Bazel server (pid 18433)...
INFO: Downloading from https://bitbucket.org/eigen/eigen/get/50812b426b7c.tar.gz: 0B

Problem encountered:

ERROR: /home/learning/tensorflow/tensorflow/core/kernels/BUILD:640:1: C++ compilation of rule '//tensorflow/core/kernels:padding_fifo_queue' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wl,-z,-relro,-z,now -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 ... (remaining 70 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
[1,604 / 2,192] Still waiting for 199 jobs to complete:
      Running (standalone):
        Compiling tensorflow/core/kernels/queue_base.cc, 5653 s
        Compiling tensorflow/core/kernels/split_lib_cpu.cc, 15 s

Fix: the machine ran out of memory. After raising the virtual machine's memory to 4 GB, the build succeeded.
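
When adding memory is not an option, reducing the number of parallel compile jobs may also help, since each cc1plus process needs its own memory. Bazel's --jobs option limits how many build actions run at once. The sketch below derives a job count from /proc/meminfo. The rule of one job per 2 GiB is an assumption made for illustration, not a figure from Bazel or TensorFlow, and suggest_jobs is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a job count of one per 2 GiB of RAM
# (an assumed rule of thumb), with a minimum of 1.
# MemTotal in /proc/meminfo is reported in kB.
suggest_jobs() {
  meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/ {
    jobs = int($2 / (2 * 1024 * 1024))
    if (jobs < 1) jobs = 1
    print jobs
  }' "$meminfo"
}

# Intended use (commented out so the sketch starts no build):
#   bazel build -c opt --jobs="$(suggest_jobs)" \
#     tensorflow/tools/pip_package:build_pip_package
```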

INFO: From Compiling tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc: In member function ‘virtual void tensorflow::UpdateFertileSlots::Compute(tensorflow::OpKernelContext*)’:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:176:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (; i < values->size(); ++i) {
^
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc: In member function ‘void tensorflow::UpdateFertileSlots::SetNewNonFertileLeaves(tensorflow::UpdateFertileSlots::HeapValuesType*, int, tensorflow::OpKernelContext*)’:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:340:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int32 i = start; i < values->size(); ++i) {
^
Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
bazel-bin/tensorflow/tools/pip_package/build_pip_package
INFO: Elapsed time: 9696.811s, Critical Path: 7936.35s

learning@learning-virtual-machine:~/tensorflow$ mkdir /tmp/tensorflow_pkg
learning@learning-virtual-machine:~/tensorflow$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
2016年 05月 06日星期五 11:22:55 CST : === Using tmpdir: /tmp/tmp.n9viqhep4u
/tmp/tmp.n9viqhep4u ~/tensorflow
2016年 05月 06日星期五 11:23:01 CST : === Building wheel
2016年 05月 06日星期五 11:24:09 CST : === Output wheel file is in: /tmp/tensorflow_pkg
learning@learning-virtual-machine:~/tensorflow$

pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl

learning@learning-virtual-machine:/tmp/tensorflow_pkg$ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl
Unpacking ./tensorflow-0.8.0-py2-none-any.whl
Downloading/unpacking six>=1.10.0 (from tensorflow==0.8.0)
  Cannot fetch index base URL https://pypi.python.org/simple/
  Downloading six-1.10.0-py2.py3-none-any.whl
Downloading/unpacking protobuf==3.0.0b2 (from tensorflow==0.8.0)
  Downloading protobuf-3.0.0b2-py2.py3-none-any.whl (326kB): 326kB downloaded
Downloading/unpacking wheel (from tensorflow==0.8.0)
  Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB): 66kB downloaded
Downloading/unpacking numpy>=1.8.2 (from tensorflow==0.8.0)

m/mtrand/randomkit.o build/temp.linux-x86_64-2.7/numpy/random/mtrand/initarray.o build/temp.linux-x86_64-2.7/numpy/random/mtrand/distributions.o -Lbuild/temp.linux-x86_64-2.7 -o build/lib.linux-x86_64-2.7/numpy/random/mtrand.so
Creating build/scripts.linux-x86_64-2.7/f2py
adding ‘build/scripts.linux-x86_64-2.7/f2py’ to scripts
changing mode of build/scripts.linux-x86_64-2.7/f2py from 664 to 775

warning: no previously-included files matching '*.pyo' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
changing mode of /home/learning/.local/bin/f2py to 775

Successfully installed tensorflow six protobuf wheel numpy setuptools
Cleaning up…

With the pip package created and installed, the build and installation are complete.

1. Problem:

The 'build' command is only supported from within a workspace.

Fix: run the command from inside the TensorFlow source directory, which contains the WORKSPACE file:

learning@learning-virtual-machine:~/tensorflow$ bazel build -c opt tensorflow/tools/pip_package:build_pip_package
.........................
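
Bazel treats a directory tree as a workspace when its root holds a WORKSPACE file, which is why the same command works from ~/tensorflow but not from the home directory. A small sketch of that lookup, using a hypothetical helper name find_workspace_root, walks upward from a directory until it finds the file:

```shell
#!/bin/sh
# Hypothetical helper: print the nearest ancestor directory (or the
# directory itself) that contains a WORKSPACE file; fail if none does.
find_workspace_root() {
  dir="$(cd "${1:-.}" && pwd)" || return 1
  while :; do
    if [ -f "$dir/WORKSPACE" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    if [ "$dir" = "/" ]; then
      return 1
    fi
    dir="$(dirname "$dir")"
  done
}

# Intended use (commented out so the sketch starts no build):
#   cd "$(find_workspace_root ~/tensorflow/tensorflow/core)" && bazel build ...
```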

2. Problem:

INFO: Waiting for response from Bazel server (pid 15464)...
ERROR: /home/learning/tensorflow/WORKSPACE:16:6: First argument of load() is a path, not a label. It should start with a single slash if it is an absolute path..
ERROR: /home/learning/tensorflow/WORKSPACE:20:6: First argument of load() is a path, not a label. It should start with a single slash if it is an absolute path..
ERROR: WORKSPACE file could not be parsed.
ERROR: no such package 'external': Package 'external' contains errors.
INFO: Elapsed time: 9.814s

Fix: the installed Bazel version was too old; switching to 0.2.2 resolved it.
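
Before starting a long build it is worth confirming which Bazel is actually on the PATH. `bazel version` prints a "Build label:" line carrying the release number. The sketch below extracts it; bazel_label is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `bazel version` output on stdin and prints
# the value of its "Build label:" line.
bazel_label() {
  sed -n 's/^Build label: *//p'
}

# Intended use (commented out because it needs Bazel installed):
#   bazel version | bazel_label
```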

Source code analysis:
example_trainer.cc

/* Copyright 2015 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/

#include <cstdio>
#include <functional>
#include <string>
#include <vector>

#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/graph/default_device.h"
#include "tensorflow/core/graph/graph_def_builder.h"
#include "tensorflow/core/lib/core/threadpool.h"
#include "tensorflow/core/lib/strings/stringprintf.h"
#include "tensorflow/core/platform/init_main.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/types.h"
#include "tensorflow/core/public/session.h"

using tensorflow::string;
using tensorflow::int32;

namespace tensorflow {
namespace example {

struct Options {
  int num_concurrent_sessions = 10;  // The number of concurrent sessions
  int num_concurrent_steps = 10;     // The number of concurrent steps
  int num_iterations = 100;          // Each step repeats this many times
  bool use_gpu = false;              // Whether to use gpu in the training
};

// A = [3 2; -1 0]; x = rand(2, 1);
// We want to compute the largest eigenvalue for A.
// repeat x = y / y.norm(); y = A * x; end
GraphDef CreateGraphDef() {
  // TODO(jeff,opensource): This should really be a more interesting
  // computation.  Maybe turn this into an mnist model instead?
  GraphDefBuilder b;
  using namespace ::tensorflow::ops;  // NOLINT(build/namespaces)
  // Store rows [3, 2] and [-1, 0] in row major format.
  Node* a = Const({3.f, 2.f, -1.f, 0.f}, {2, 2}, b.opts());

  // x is from the feed.
  Node* x = Const({0.f}, {2, 1}, b.opts().WithName("x"));

  // y = A * x
  Node* y = MatMul(a, x, b.opts().WithName("y"));

  // y2 = y.^2
  Node* y2 = Square(y, b.opts());

  // y2_sum = sum(y2)
  Node* y2_sum = Sum(y2, Const(0, b.opts()), b.opts());

  // y_norm = sqrt(y2_sum)
  Node* y_norm = Sqrt(y2_sum, b.opts());

  // y_normalized = y ./ y_norm
  Div(y, y_norm, b.opts().WithName("y_normalized"));

  GraphDef def;
  TF_CHECK_OK(b.ToGraphDef(&def));
  return def;
}

string DebugString(const Tensor& x, const Tensor& y) {
  CHECK_EQ(x.NumElements(), 2);
  CHECK_EQ(y.NumElements(), 2);
  auto x_flat = x.flat<float>();
  auto y_flat = y.flat<float>();
  const float lambda = y_flat(0) / x_flat(0);
  return strings::Printf("lambda = %8.6f x = [%8.6f %8.6f] y = [%8.6f %8.6f]",
                         lambda, x_flat(0), x_flat(1), y_flat(0), y_flat(1));
}

void ConcurrentSteps(const Options* opts, int session_index) {
  // Creates a session.
  SessionOptions options;
  std::unique_ptr<Session> session(NewSession(options));
  GraphDef def = CreateGraphDef();
  if (options.target.empty()) {
    graph::SetDefaultDevice(opts->use_gpu ? "/gpu:0" : "/cpu:0", &def);
  }

  TF_CHECK_OK(session->Create(def));

  // Spawn M threads for M concurrent steps.
  const int M = opts->num_concurrent_steps;
  thread::ThreadPool step_threads(Env::Default(), "trainer", M);

  for (int step = 0; step < M; ++step) {
    step_threads.Schedule([&session, opts, session_index, step]() {
      // Randomly initialize the input.
      Tensor x(DT_FLOAT, TensorShape({2, 1}));
      x.flat<float>().setRandom();

      // Iterations.
      std::vector<Tensor> outputs;
      for (int iter = 0; iter < opts->num_iterations; ++iter) {
        outputs.clear();
        TF_CHECK_OK(
            session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs));
        CHECK_EQ(size_t{2}, outputs.size());

        const Tensor& y = outputs[0];
        const Tensor& y_norm = outputs[1];
        // Print out lambda, x, and y.
        std::printf("%06d/%06d %s\n", session_index, step,
                    DebugString(x, y).c_str());
        // Copies y_normalized to x.
        x = y_norm;
      }
    });
  }

  TF_CHECK_OK(session->Close());
}

void ConcurrentSessions(const Options& opts) {
  // Spawn N threads for N concurrent sessions.
  const int N = opts.num_concurrent_sessions;
  thread::ThreadPool session_threads(Env::Default(), "trainer", N);
  for (int i = 0; i < N; ++i) {
    session_threads.Schedule(std::bind(&ConcurrentSteps, &opts, i));
  }
}

}  // end namespace example
}  // end namespace tensorflow

namespace {

bool ParseInt32Flag(tensorflow::StringPiece arg, tensorflow::StringPiece flag,
                    int32* dst) {
  if (arg.Consume(flag) && arg.Consume("=")) {
    char extra;
    return (sscanf(arg.data(), "%d%c", dst, &extra) == 1);
  }

  return false;
}

bool ParseBoolFlag(tensorflow::StringPiece arg, tensorflow::StringPiece flag,
                   bool* dst) {
  if (arg.Consume(flag)) {
    if (arg.empty()) {
      *dst = true;
      return true;
    }

    if (arg == "=true") {
      *dst = true;
      return true;
    } else if (arg == "=false") {
      *dst = false;
      return true;
    }
  }

  return false;
}

}  // namespace

int main(int argc, char* argv[]) {
  tensorflow::example::Options opts;
  std::vector<char*> unknown_flags;
  for (int i = 1; i < argc; ++i) {
    if (string(argv[i]) == "--") {
      while (i < argc) {
        unknown_flags.push_back(argv[i]);
        ++i;
      }
      break;
    }

    if (ParseInt32Flag(argv[i], "--num_concurrent_sessions",
                       &opts.num_concurrent_sessions) ||
        ParseInt32Flag(argv[i], "--num_concurrent_steps",
                       &opts.num_concurrent_steps) ||
        ParseInt32Flag(argv[i], "--num_iterations", &opts.num_iterations) ||
        ParseBoolFlag(argv[i], "--use_gpu", &opts.use_gpu)) {
      continue;
    }

    fprintf(stderr, "Unknown flag: %s\n", argv[i]);
    return -1;
  }

  // Passthrough any unknown flags.
  int dst = 1;  // Skip argv[0]
  for (char* f : unknown_flags) {
    argv[dst++] = f;
  }
  argv[dst++] = nullptr;
  argc = unknown_flags.size() + 1;
  tensorflow::port::InitMain(argv[0], &argc, &argv);
  tensorflow::example::ConcurrentSessions(opts);
}
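
The graph built in CreateGraphDef is a power iteration: multiply by A, normalize, repeat. For A = [3 2; -1 0] the characteristic polynomial is λ² - 3λ + 2, so the eigenvalues are 1 and 2, and the iteration should approach 2. The sketch below reproduces the arithmetic without TensorFlow, using awk for the floating-point work. The fixed start vector [1, 0] is an assumption made for repeatability (the C++ example draws a random one), and power_iteration is a hypothetical function name.

```shell
#!/bin/sh
# Sketch of the power iteration for A = [3 2; -1 0].
# Start vector [1, 0] is an assumption; the C++ example uses a random one.
power_iteration() {
  awk -v iters="$1" 'BEGIN {
    x0 = 1; x1 = 0
    for (i = 0; i < iters; i++) {
      y0 = 3 * x0 + 2 * x1            # y = A * x
      y1 = -1 * x0 + 0 * x1
      lambda = y0 / x0                # same estimate as DebugString: y(0) / x(0)
      norm = sqrt(y0 * y0 + y1 * y1)
      x0 = y0 / norm; x1 = y1 / norm  # x = y / y.norm()
    }
    printf "%.4f\n", lambda
  }'
}

power_iteration 50   # prints 2.0000
```

With this start vector the estimate after k steps is (2^(k+1) - 1) / (2^k - 1), so 10 steps give 2047/1023, about 2.0010, and the error roughly halves with each further step.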