Linux教程網

Linux Memory Management: The Slab Mechanism (Creating a Cache)

Date: 2017/2/28 15:59:50   Editor: Linux教程
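In the Linux kernel, a cache node is created by the function kmem_cache_create().

The function proceeds as follows:

1. Obtain a cache descriptor from the global cache_cache by calling kmem_cache_zalloc(&cache_cache, gfp). The object size that cache_cache was initialized with is exactly the size of struct kmem_cache, so the returned pointer can be used directly as a cache descriptor.

2. Call calculate_slab_order() to determine the slab's page order and the number of objects per slab; its return value is the size of the leftover (fragment) space in a slab.

3. Compute and initialize the cache's attributes. If the slab management data is stored off-slab, use kmem_find_general_cachep(slab_size, 0u) to set cachep->slabp_cache, the cache from which the struct slab object and the kmem_bufctl_t[] array are later allocated.

4. Set up the per-CPU local cache with setup_cpu_cache().

5. The cache is now complete; add it to the global slab cache list.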
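
Step 2 is easier to picture with numbers. The following user-space sketch models, for one fixed slab size, how many objects fit when the management area lives inside the slab and how much space is left over. It is a simplified model, not the kernel function: MGMT_HDR_SIZE and BUFCTL_SIZE are assumed stand-ins for sizeof(struct slab) and sizeof(kmem_bufctl_t), and the names struct estimate and estimate_on_slab() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed, illustrative sizes; the real values depend on the kernel build. */
#define MGMT_HDR_SIZE 32u	/* stand-in for sizeof(struct slab) */
#define BUFCTL_SIZE    4u	/* stand-in for sizeof(kmem_bufctl_t) */

/* Result of the estimate: objects per slab and unused bytes. */
struct estimate {
	size_t num;
	size_t left_over;
};

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Size of the on-slab management area for nr_objs objects. */
static size_t mgmt_size(size_t nr_objs, size_t align)
{
	return align_up(MGMT_HDR_SIZE + nr_objs * BUFCTL_SIZE, align);
}

/*
 * Hypothetical helper: slab_bytes is the slab size in bytes, obj_size the
 * already aligned object size, align the object alignment.
 */
static struct estimate estimate_on_slab(size_t slab_bytes, size_t obj_size,
					size_t align)
{
	struct estimate e = { 0, 0 };
	size_t n;

	assert(obj_size > 0 && align > 0 && (align & (align - 1)) == 0);
	assert(slab_bytes > MGMT_HDR_SIZE);

	/* First guess: ignore the padding added by aligning the header. */
	n = (slab_bytes - MGMT_HDR_SIZE) / (obj_size + BUFCTL_SIZE);

	/* Back off while the aligned header plus the objects overflow the slab. */
	while (n > 0 && mgmt_size(n, align) + n * obj_size > slab_bytes)
		n--;

	if (n == 0)
		return e;	/* nothing fits; left_over is not meaningful */

	e.num = n;
	e.left_over = slab_bytes - n * obj_size - mgmt_size(n, align);
	return e;
}
```

With the assumed sizes, a 4096-byte slab of 128-byte objects aligned to 8 bytes holds 30 objects and leaves 104 bytes over; that leftover is the space later used for colouring.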

I. Main implementation

[cpp]
/**
 * kmem_cache_create - Create a cache.
 * @name: A string which is used in /proc/slabinfo to identify this cache.
 * @size: The size of objects to be created in this cache.
 * @align: The required alignment for the objects.
 * @flags: SLAB flags
 * @ctor: A constructor for the objects.
 *
 * Returns a ptr to the cache on success, NULL on failure.
 * Cannot be called within a int, but can be interrupted.
 * The @ctor is run when new pages are allocated by the cache.
 *
 * @name must be valid until the cache is destroyed. This implies that
 * the module calling this has to destroy the cache before getting unloaded.
 * Note that kmem_cache_name() is not guaranteed to return the same pointer,
 * therefore applications must manage it themselves.
 *
 * The flags are
 *
 * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
 * to catch references to uninitialised memory.
 *
 * %SLAB_RED_ZONE - Insert `Red' zones around the allocated memory to check
 * for buffer overruns.
 *
 * %SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
 * cacheline. This can be beneficial if you're counting cycles as closely
 * as davem.
 */
/*
 * Creates a top-level cache node of the slab system. Right after creation
 * the cache holds no slabs and no objects; a new slab is created only when
 * an object is requested and the cache has no free object.
 */
struct kmem_cache *
kmem_cache_create (const char *name, size_t size, size_t align,
	unsigned long flags, void (*ctor)(void *))
{
	size_t left_over, slab_size, ralign;
	struct kmem_cache *cachep = NULL, *pc;
	gfp_t gfp;

	/*
	 * Sanity checks... these are all serious usage bugs.
	 */
	if (!name || in_interrupt() || (size < BYTES_PER_WORD) ||
	    size > KMALLOC_MAX_SIZE) {
		printk(KERN_ERR "%s: Early error in slab %s\n", __func__,
				name);
		BUG();
	}

	/*
	 * We use cache_chain_mutex to ensure a consistent view of
	 * cpu_online_mask as well. Please see cpuup_callback
	 */
	/*
	 * Has the slab allocator finished initialising? During early boot
	 * only one CPU runs the slab initialisation, so no locking is
	 * needed; otherwise the lock must be taken.
	 */
	if (slab_is_available()) {
		get_online_cpus();
		mutex_lock(&cache_chain_mutex);
	}

	/* Walk the cache chain and run some sanity checks. */
	list_for_each_entry(pc, &cache_chain, next) {
		char tmp;
		int res;

		/*
		 * This happens when the module gets unloaded and doesn't
		 * destroy its slab cache and no-one else reuses the vmalloc
		 * area of the module. Print a warning.
		 */
		/* Check that each cache on the chain still has a readable name. */
		res = probe_kernel_address(pc->name, tmp);
		if (res) {	/* the name cannot be read: report it */
			printk(KERN_ERR
			       "SLAB: cache with size %d has lost its name\n",
			       pc->buffer_size);
			continue;
		}

		/* Check whether a cache with the same name already exists. */
		if (!strcmp(pc->name, name)) {
			printk(KERN_ERR
			       "kmem_cache_create: duplicate cache %s\n", name);
			dump_stack();
			goto oops;
		}
	}

#if DEBUG
	WARN_ON(strchr(name, ' '));	/* It confuses parsers */
#if FORCED_DEBUG
	/*
	 * Enable redzoning and last user accounting, except for caches with
	 * large objects, if the increased size would increase the object size
	 * above the next power of two: caches with object sizes just above a
	 * power of two have a significant amount of internal fragmentation.
	 */
	if (size < 4096 || fls(size - 1) == fls(size-1 + REDZONE_ALIGN +
						2 * sizeof(unsigned long long)))
		flags |= SLAB_RED_ZONE | SLAB_STORE_USER;
	if (!(flags & SLAB_DESTROY_BY_RCU))
		flags |= SLAB_POISON;
#endif
	if (flags & SLAB_DESTROY_BY_RCU)
		BUG_ON(flags & SLAB_POISON);
#endif
	/*
	 * Always checks flags, a caller might be expecting debug support which
	 * isn't available.
	 */
	BUG_ON(flags & ~CREATE_MASK);

	/*
	 * Check that size is in terms of words. This is needed to avoid
	 * unaligned accesses for some archs when redzoning is used, and makes
	 * sure any on-slab bufctl's are also correctly aligned.
	 */
	if (size & (BYTES_PER_WORD - 1)) {
		size += (BYTES_PER_WORD - 1);
		size &= ~(BYTES_PER_WORD - 1);
	}

	/* calculate the final buffer alignment: */

	/* 1) arch recommendation: can be overridden for debug */
	if (flags & SLAB_HWCACHE_ALIGN) {
		/*
		 * Default alignment: as specified by the arch code. Except if
		 * an object is really small, then squeeze multiple objects into
		 * one cacheline.
		 */
		ralign = cache_line_size();
		while (size <= ralign / 2)
			ralign /= 2;
	} else {
		ralign = BYTES_PER_WORD;
	}

	/*
	 * Redzoning and user store require word alignment or possibly larger.
	 * Note this will be overridden by architecture or caller mandated
	 * alignment if either is greater than BYTES_PER_WORD.
	 */
	if (flags & SLAB_STORE_USER)
		ralign = BYTES_PER_WORD;

	if (flags & SLAB_RED_ZONE) {
		ralign = REDZONE_ALIGN;
		/* If redzoning, ensure that the second redzone is suitably
		 * aligned, by adjusting the object size accordingly. */
		size += REDZONE_ALIGN - 1;
		size &= ~(REDZONE_ALIGN - 1);
	}

	/* 2) arch mandated alignment */
	if (ralign < ARCH_SLAB_MINALIGN) {
		ralign = ARCH_SLAB_MINALIGN;
	}
	/* 3) caller mandated alignment */
	if (ralign < align) {
		ralign = align;
	}
	/* disable debug if necessary */
	if (ralign > __alignof__(unsigned long long))
		flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
	/*
	 * 4) Store it.
	 */
	align = ralign;

	/* Is the slab allocator available yet? */
	if (slab_is_available())
		gfp = GFP_KERNEL;
	else
		/* Before the slab allocator is fully up, allocations must not block. */
		gfp = GFP_NOWAIT;

	/* Get cache's description obj. */
	/*
	 * Obtain a struct kmem_cache object. The object returned by this
	 * cache can be used as a kmem_cache descriptor because the object
	 * size of the global cache_cache is exactly the size of
	 * struct kmem_cache.
	 */
	cachep = kmem_cache_zalloc(&cache_cache, gfp);
	if (!cachep)
		goto oops;

#if DEBUG
	cachep->obj_size = size;

	/*
	 * Both debugging options require word-alignment which is calculated
	 * into align above.
	 */
	if (flags & SLAB_RED_ZONE) {
		/* add space for red zone words */
		cachep->obj_offset += sizeof(unsigned long long);
		size += 2 * sizeof(unsigned long long);
	}
	if (flags & SLAB_STORE_USER) {
		/* user store requires one word storage behind the end of
		 * the real object. But if the second red zone needs to be
		 * aligned to 64 bits, we must allow that much space.
		 */
		if (flags & SLAB_RED_ZONE)
			size += REDZONE_ALIGN;
		else
			size += BYTES_PER_WORD;
	}
#if FORCED_DEBUG && defined(CONFIG_DEBUG_PAGEALLOC)
	if (size >= malloc_sizes[INDEX_L3 + 1].cs_size
	    && cachep->obj_size > cache_line_size() && size < PAGE_SIZE) {
		cachep->obj_offset += PAGE_SIZE - size;
		size = PAGE_SIZE;
	}
#endif
#endif

	/*
	 * Determine if the slab management is 'on' or 'off' slab.
	 * (bootstrapping cannot cope with offslab caches so don't do
	 * it too early on.)
	 */
	/*
	 * Decide where the slab management object is stored: on-slab or
	 * off-slab. Off-slab is chosen when the object size is at least
	 * PAGE_SIZE/8 (512 bytes with 4 KiB pages). During early
	 * initialisation the on-slab layout is always used; for
	 * slab_early_init see kmem_cache_init().
	 */
	if ((size >= (PAGE_SIZE >> 3)) && !slab_early_init)
		/*
		 * Size is large, assume best to place the slab management obj
		 * off-slab (should allow better packing of objs).
		 */
		flags |= CFLGS_OFF_SLAB;

	size = ALIGN(size, align);

	/* Compute the slab layout; the return value is the leftover space in a slab. */
	left_over = calculate_slab_order(cachep, size, align, flags);

	/*
	 * cachep->num is the number of objects per slab in this cache;
	 * 0 means no cache could be created for objects of this size.
	 */
	if (!cachep->num) {
		printk(KERN_ERR
		       "kmem_cache_create: couldn't create cache %s.\n", name);
		kmem_cache_free(&cache_cache, cachep);
		cachep = NULL;
		goto oops;
	}
	/*
	 * Size of the slab management area: the struct slab object plus
	 * the kmem_bufctl_t array.
	 */
	slab_size = ALIGN(cachep->num * sizeof(kmem_bufctl_t)
			  + sizeof(struct slab), align);

	/*
	 * If the slab has been placed off-slab, and we have enough space then
	 * move it on-slab. This is at the expense of any extra colouring.
	 */
	/*
	 * If this is an off-slab cache and the leftover space is at least
	 * as large as the management area, move the management area into
	 * the slab, turning it into an on-slab cache.
	 */
	if (flags & CFLGS_OFF_SLAB && left_over >= slab_size) {
		/* clear the off-slab flag */
		flags &= ~CFLGS_OFF_SLAB;
		/* update the leftover size */
		left_over -= slab_size;
	}

	if (flags & CFLGS_OFF_SLAB) {
		/* really off slab. No need for manual alignment */
		/*
		 * align applies to the objects in the slab. An off-slab
		 * management area does not shift the position of the
		 * objects that follow it the way an on-slab one does,
		 * so it does not need to be aligned.
		 */
		slab_size =
		    cachep->num * sizeof(kmem_bufctl_t) + sizeof(struct slab);

#ifdef CONFIG_PAGE_POISONING
		/* If we're going to use the generic kernel_map_pages()
		 * poisoning, then it's going to smash the contents of
		 * the redzone and userword anyhow, so switch them off.
		 */
		if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
			flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
#endif
	}

	/* unit size of a colour block for this cache */
	cachep->colour_off = cache_line_size();
	/* Offset must be a multiple of the alignment. */
	if (cachep->colour_off < align)
		cachep->colour_off = align;
	/* number of colour blocks that fit in the leftover area */
	cachep->colour = left_over / cachep->colour_off;
	/* size of the slab management area */
	cachep->slab_size = slab_size;
	cachep->flags = flags;
	cachep->gfpflags = 0;
	if (CONFIG_ZONE_DMA_FLAG && (flags & SLAB_CACHE_DMA))
		cachep->gfpflags |= GFP_DMA;
	/* size of each object in the slab */
	cachep->buffer_size = size;
	/* used to compute an object's index in its slab, see obj_to_index() */
	cachep->reciprocal_buffer_size = reciprocal_value(size);

	if (flags & CFLGS_OFF_SLAB) {
		/*
		 * Look up the general-purpose cache whose objects are large
		 * enough to hold slab_size bytes and store it in slabp_cache.
		 * When an off-slab slab is created later, its management
		 * area is allocated from this cache: the struct slab object
		 * comes first and the kmem_bufctl_t[] array occupies the
		 * rest. For an on-slab cache this pointer stays NULL.
		 */
		cachep->slabp_cache = kmem_find_general_cachep(slab_size, 0u);
		/*
		 * This is a possibility for one of the malloc_sizes caches.
		 * But since we go off slab only for object size greater than
		 * PAGE_SIZE/8, and malloc_sizes gets created in ascending order,
		 * this should not happen at all.
		 * But leave a BUG_ON for some lucky dude.
		 */
		BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));
	}
	cachep->ctor = ctor;
	cachep->name = name;

	/* set up the local cache on each CPU */
	if (setup_cpu_cache(cachep, gfp)) {
		__kmem_cache_destroy(cachep);
		cachep = NULL;
		goto oops;
	}

	/* cache setup completed, link it into the list */
	list_add(&cachep->next, &cache_chain);
oops:
	if (!cachep && (flags & SLAB_PANIC))
		panic("kmem_cache_create(): failed to create slab `%s'\n",
		      name);
	if (slab_is_available()) {
		mutex_unlock(&cache_chain_mutex);
		put_online_cpus();
	}
	return cachep;
}
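
The alignment and colouring arithmetic in the listing is easier to follow with concrete values. The user-space sketch below reproduces only the non-debug path: rounding the size to whole words, the SLAB_HWCACHE_ALIGN halving loop, the architecture and caller minimums, the final ALIGN(), and the colour count. WORD_BYTES, CACHE_LINE and ARCH_MINALIGN are assumed values for a typical 64-bit machine, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the kernel derives these per architecture. */
#define WORD_BYTES     8u	/* stand-in for BYTES_PER_WORD */
#define CACHE_LINE    64u	/* stand-in for cache_line_size() */
#define ARCH_MINALIGN  8u	/* stand-in for ARCH_SLAB_MINALIGN */

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Round the requested size up to a whole number of machine words. */
static size_t round_to_word(size_t size)
{
	return align_up(size, WORD_BYTES);
}

/*
 * Alignment chosen for a word-rounded size: cache-line alignment (shrunk
 * for small objects) or word alignment, then raised to the architecture
 * minimum and to the caller's request (0 or a power of two).
 */
static size_t final_align(size_t size, size_t caller_align, int hwcache_align)
{
	size_t ralign;

	assert(size >= WORD_BYTES);

	if (hwcache_align) {
		ralign = CACHE_LINE;
		/* Squeeze several small objects into one cache line. */
		while (size <= ralign / 2)
			ralign /= 2;
	} else {
		ralign = WORD_BYTES;
	}
	if (ralign < ARCH_MINALIGN)
		ralign = ARCH_MINALIGN;
	if (ralign < caller_align)
		ralign = caller_align;
	return ralign;
}

/* Object size as stored in the cache: word-rounded, then aligned. */
static size_t final_size(size_t size, size_t caller_align, int hwcache_align)
{
	size_t s = round_to_word(size);

	return align_up(s, final_align(s, caller_align, hwcache_align));
}

/* Number of colour offsets that fit in a slab's leftover space. */
static size_t colour_count(size_t left_over, size_t align)
{
	size_t colour_off = CACHE_LINE;

	/* The colour step must be a multiple of the object alignment. */
	if (colour_off < align)
		colour_off = align;
	return left_over / colour_off;
}
```

With these assumed constants, a 20-byte object requested with cache-line alignment is rounded to 24 bytes, receives a 32-byte alignment and ends up occupying 32 bytes, so two objects share one 64-byte cache line. A slab with 104 leftover bytes and 8-byte alignment has room for one colour offset.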

The global cache_cache used above is defined as follows:

[cpp]
/* internal cache of cache description objs */
static struct kmem_cache cache_cache = {
	.batchcount = 1,
	.limit = BOOT_CPUCACHE_ENTRIES,
	.shared = 1,
	.buffer_size = sizeof(struct kmem_cache),	/* object size is that of the cache descriptor, hence the name cache_cache */
	.name = "kmem_cache",
};
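
Why the first descriptor cache has to be defined statically can be shown with a toy user-space model. The sketch below is not the kernel's data structure: struct toy_cache, toy_alloc_zeroed(), toy_cache_create() and the use of calloc() as backing store are assumptions made purely for illustration. It only demonstrates that a cache whose object size equals the descriptor size hands back memory that can be used directly as a new descriptor, which is the role cache_cache plays for kmem_cache_zalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy descriptor; the real struct kmem_cache has many more fields. */
struct toy_cache {
	const char *name;
	size_t buffer_size;		/* size of each object handed out */
	unsigned long nr_allocs;	/* objects handed out so far */
};

/* The bootstrap cache is static, so no allocator is needed to create it. */
static struct toy_cache toy_cache_cache = {
	.name = "toy_cache",
	.buffer_size = sizeof(struct toy_cache),
};

/* Hand out one zero-filled object of the cache's object size. */
static void *toy_alloc_zeroed(struct toy_cache *cachep)
{
	void *obj = calloc(1, cachep->buffer_size);

	if (obj)
		cachep->nr_allocs++;
	return obj;
}

/* Create a descriptor for a new cache by allocating from the bootstrap cache. */
static struct toy_cache *toy_cache_create(const char *name, size_t size)
{
	struct toy_cache *cachep;

	assert(name != NULL && size > 0);

	/* The returned object is descriptor-sized, so it is used as one directly. */
	cachep = toy_alloc_zeroed(&toy_cache_cache);
	if (!cachep)
		return NULL;
	cachep->name = name;
	cachep->buffer_size = size;
	return cachep;
}
```

Every descriptor other than toy_cache_cache comes out of toy_cache_cache itself, so the static definition is what breaks the chicken-and-egg dependency.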