Weekly Report: Hopfield Networks and Boltzmann Machines
Questions
1. What is the essential difference between the single-layer perceptron and the two-layer and multi-layer neural networks we discussed last week? Is it only the number of hidden layers?
2. We can move the temperature dependence of $P$ into the exponent and change the base to $e$ (we know any positive number $x$ can be written as $e^{\ln(x)}$). We then define $-\frac{\ln[p]}{\epsilon} = T$.
Why can this quantity represent temperature?
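For reference, and independent of the linked derivation: in statistical mechanics the Boltzmann distribution assigns a state of energy $\epsilon$ the probability

$$P(\epsilon) = \frac{1}{Z}\, e^{-\epsilon/(k_B T)},$$

so answering the question comes down to checking that the quantity defined above occupies the same slot in the exponent that $k_B T$ does here.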
Notes on Boltzmann machines
For details, see the write-up: 玻尔兹曼机 (Boltzmann Machine) | moshiqiqian
Discussion of Hopfield Network Questions
1. Hebbian learning does not distinguish sign
The core idea of Hebbian learning is "neurons that fire together, wire together." It indiscriminately strengthens the connections between neurons that are active at the same time, without ever judging whether a connection is good or bad, or whether the learned pattern itself is "desirable."
Saying that Hebbian learning "does not distinguish sign" means the rule cannot tell a pattern from its negation when updating the weights.
As a consequence, if a pattern P is stored, the inverted pattern -P is stored along with it, and under Hebbian learning the two are exactly equivalent as far as the weight update is concerned. In a Hopfield network, the practical result is that retrieval can return the exact inverse of the target pattern.
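The symmetry is a one-line calculation from the Hebbian weight rule and the energy function used below:

$$W = \sum_{\mu} p^{\mu} (p^{\mu})^{\mathsf T} = \sum_{\mu} \left(-p^{\mu}\right)\left(-p^{\mu}\right)^{\mathsf T}, \qquad E(-s) = -\tfrac{1}{2}(-s)^{\mathsf T} W (-s) = E(s),$$

so $-P$ sits at exactly the same energy as $P$ and is an equally stable attractor.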
The code below demonstrates this.
```python
import numpy as np
import matplotlib.pyplot as plt
```
```python
def train(patterns):
    """Hebbian learning: sum the outer products of the stored patterns."""
    num_neuron = patterns[0].shape[0]
    w = np.zeros((num_neuron, num_neuron))
    for pattern in patterns:
        pattern = pattern.reshape(-1, 1)
        w += np.dot(pattern, pattern.T)
    np.fill_diagonal(w, 0)  # no self-connections
    return w
```
```python
def E(state, w):
    """Hopfield energy: E = -1/2 * s^T W s."""
    e = -0.5 * np.dot(state.T, np.dot(w, state))
    return e
```
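As a quick sanity check (written for this note, not part of the original experiments), these two functions already expose the sign-blindness:

```python
# The Hebbian weights built from P and from -P are identical, since (-P)(-P)^T = P P^T,
# and the energy is invariant under a global sign flip of the state.
P_demo = np.array([1, -1, 1, 1, -1])
w_demo = train([P_demo])
assert np.array_equal(w_demo, train([-P_demo]))
assert E(P_demo, w_demo) == E(-P_demo, w_demo)
```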
```python
def update(state, w, index):
    """Asynchronously update one neuron to the sign of its net input."""
    net_input = np.dot(w[index, :], state)
    if net_input > 0:
        new_state = 1
    elif net_input < 0:
        new_state = -1
    else:
        new_state = state[index]  # zero net input: keep the current state
    changed = (new_state != state[index])
    state[index] = new_state
    return state, changed
```
```python
def work(initial_state, w, n=1000):
    """Run n random asynchronous updates, recording energy and state history."""
    current_state = np.copy(initial_state)
    e_history = [E(initial_state, w)]
    num_neuron = initial_state.shape[0]
    state_history = [np.copy(initial_state)]
    for i in range(n):
        index = np.random.randint(num_neuron)
        current_state, changed = update(current_state, w, index)
        e_history.append(E(current_state, w))
        state_history.append(np.copy(current_state))
    return current_state, e_history, state_history
```
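One standard fact worth keeping in mind when reading e_history: with symmetric weights and a zero diagonal, an asynchronous single-neuron update can never increase the energy. Flipping neuron $i$ from $s_i$ to $s_i'$ changes it by

$$\Delta E = -\left(s_i' - s_i\right) \sum_j w_{ij} s_j \le 0,$$

because the update rule chooses $s_i'$ to agree with the sign of the net input $\sum_j w_{ij} s_j$. This is why the energy curves in the experiments below only descend or plateau.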
```python
def create_random_pattern(num_neurons):
    """Generate a random bipolar (+1/-1) pattern."""
    return np.random.choice([-1, 1], size=num_neurons)
```
```python
def add_noise(pattern, noise_level=0.1):
    """Flip a fraction (noise_level) of the pattern's bits."""
    noisy_pattern = np.copy(pattern)
    num_to_flip = int(np.round(len(pattern) * noise_level))
    # Guarantee at least one flip whenever any noise is requested.
    if num_to_flip == 0 and noise_level > 0 and len(pattern) > 0:
        num_to_flip = 1
    flip_indices = np.random.choice(len(pattern), num_to_flip, replace=False)
    noisy_pattern[flip_indices] *= -1
    return noisy_pattern
```
```python
def get_overlap(state1, state2):
    """Normalized overlap in [-1, 1]: 1 = identical, -1 = exactly inverted."""
    return np.dot(state1, state2) / len(state1)
```
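With all the helpers in place, here is a minimal sketch of the sign-blindness in action (illustrative only; the seed and sizes are arbitrary choices, not from the experiments below): store a single pattern $P$, start near its inverse, and the network settles into $-P$ rather than $P$.

```python
# Store one pattern, then run recall from a noisy copy of its inverse.
np.random.seed(0)
P = create_random_pattern(10)
w = train([P])
start = add_noise(-P, noise_level=0.2)
final, _, _ = work(start, w, n=200)
print(get_overlap(P, final))  # typically -1.0: the attractor reached is -P, not P
```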
```python
num_neurons = 10
max_iterations = 500

print(f"--- Hopfield network visualization ({num_neurons} neurons) ---")
print(f"Theoretical capacity limit: ~{0.138 * num_neurons:.2f} patterns")
```

```
--- Hopfield network visualization (10 neurons) ---
Theoretical capacity limit: ~1.38 patterns
```
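The 0.138 factor is the classical capacity estimate from the statistical-mechanics analysis of the Hopfield model: for random patterns stored by the Hebbian rule, reliable recall breaks down once the number of patterns exceeds roughly

$$p_{\max} \approx 0.138\,N,$$

where $N$ is the number of neurons, i.e. barely more than one pattern for our $N = 10$ network.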
```python
# Use the SimHei font so CJK glyphs render in figures, and disable the
# unicode minus sign to avoid missing-glyph boxes on axis labels.
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
```
```python
print("\n--- Experiment 1: Hebbian learning, the positive case (successful recall) ---")
num_patterns_recall = 2
np.random.seed(42)
stored_patterns_recall = [create_random_pattern(num_neurons) for _ in range(num_patterns_recall)]

print(f"Stored {num_patterns_recall} patterns:")
for i, p in enumerate(stored_patterns_recall):
    print(f"  Pattern {i+1}: {p}")

w_recall = train(stored_patterns_recall)

plt.figure(figsize=(14, 6))
for i, original_pattern in enumerate(stored_patterns_recall):
    # Corrupt each stored pattern with 20% noise and let the network relax.
    noise_level = 0.2
    initial_state = add_noise(original_pattern, noise_level)
    print(f"\nTesting pattern {i+1}:")
    print(f"  Initial state ({int(noise_level*100)}% noise): {initial_state}")

    final_state_recall, e_history_recall, state_history_recall = work(initial_state, w_recall, n=max_iterations)
    overlaps_recall = [get_overlap(original_pattern, s) for s in state_history_recall]

    plt.subplot(1, 2, 1)
    plt.plot(e_history_recall, label=f'Pattern {i+1} energy')
    plt.title('Energy evolution (successful recall)')
    plt.xlabel('Iteration')
    plt.ylabel('Energy (E)')
    plt.grid(True)

    plt.subplot(1, 2, 2)
    plt.plot(overlaps_recall, label=f'Pattern {i+1} overlap')
    plt.title('Overlap with original pattern (successful recall)')
    plt.xlabel('Iteration')
    plt.ylabel('Overlap')
    plt.grid(True)
    plt.ylim(-1.1, 1.1)

plt.subplot(1, 2, 1)
plt.legend()
plt.subplot(1, 2, 2)
plt.legend()
plt.tight_layout()
plt.show()
```

```
--- Experiment 1: Hebbian learning, the positive case (successful recall) ---
Stored 2 patterns:
  Pattern 1: [-1  1 -1 -1 -1  1 -1 -1 -1  1]
  Pattern 2: [-1 -1 -1 -1  1 -1  1  1  1 -1]

Testing pattern 1:
  Initial state (20% noise): [-1  1  1 -1 -1  1 -1 -1  1  1]

Testing pattern 2:
  Initial state (20% noise): [-1 -1  1 -1  1 -1  1 -1  1 -1]
```
```python
print("\n" + "=" * 50)
print("--- Experiment 2: Hebbian learning, a negative case (storage-capacity limit) ---")
num_patterns_capacity = 20
np.random.seed(43)
stored_patterns_capacity = [create_random_pattern(num_neurons) for _ in range(num_patterns_capacity)]
print(f"Trying to store {num_patterns_capacity} patterns (beyond the theoretical capacity {0.138 * num_neurons:.2f} of {num_neurons} neurons):")

w_capacity = train(stored_patterns_capacity)

plt.figure(figsize=(14, 6))

# Take the first stored pattern, corrupt it slightly, and try to recall it.
test_pattern_idx = 0
original_pattern_test = stored_patterns_capacity[test_pattern_idx]
noise_level_capacity = 0.05
initial_state_capacity = add_noise(original_pattern_test, noise_level_capacity)
print(f"\nTesting stored pattern {test_pattern_idx+1} ({int(noise_level_capacity*100)}% noise):")
print(f"  Initial overlap: {get_overlap(original_pattern_test, initial_state_capacity):.2f}")

final_state_capacity, e_history_capacity, state_history_capacity = work(initial_state_capacity, w_capacity, n=max_iterations)
overlaps_capacity = [get_overlap(original_pattern_test, s) for s in state_history_capacity]
print(f"  Final overlap with the original pattern: {get_overlap(original_pattern_test, final_state_capacity):.2f}")

# Check whether the final state matches any stored pattern (or its inverse).
is_correctly_recalled = False
for j, stored_p in enumerate(stored_patterns_capacity):
    if np.array_equal(stored_p, final_state_capacity) or np.array_equal(stored_p, -final_state_capacity):
        print(f"Final state matches stored pattern {j+1}.")
        is_correctly_recalled = True
        break
if not is_correctly_recalled:
    print("No stored pattern was recalled; the network most likely converged to a spurious state.")

plt.subplot(1, 2, 1)
plt.plot(e_history_capacity, label='Capacity-test energy')
plt.title('Energy evolution (capacity limit)')
plt.xlabel('Iteration')
plt.ylabel('Energy (E)')
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(overlaps_capacity, label='Overlap with test pattern')
plt.title('Overlap evolution (capacity limit)')
plt.xlabel('Iteration')
plt.ylabel('Overlap')
plt.grid(True)
plt.ylim(-1.1, 1.1)

plt.subplot(1, 2, 1)
plt.legend()
plt.subplot(1, 2, 2)
plt.legend()
plt.tight_layout()
plt.show()
```

```
==================================================
--- Experiment 2: Hebbian learning, a negative case (storage-capacity limit) ---
Trying to store 20 patterns (beyond the theoretical capacity 1.38 of 10 neurons):

Testing stored pattern 1 (5% noise):
  Initial overlap: 0.80
  Final overlap with the original pattern: 0.00
Final state matches stored pattern 3.
```
```python
print("\n" + "=" * 50)
print("--- Experiment 3: Hebbian learning, a negative case (convergence to a spurious state) ---")
num_patterns_spurious = 3
np.random.seed(44)
stored_patterns_spurious = [create_random_pattern(num_neurons) for _ in range(num_patterns_spurious)]
print(f"Stored {num_patterns_spurious} patterns:")
for i, p in enumerate(stored_patterns_spurious):
    print(f"  Pattern {i+1}: {p}")

w_spurious = train(stored_patterns_spurious)

# Start from a completely random state instead of a noisy stored pattern.
random_initial_state = create_random_pattern(num_neurons)
print(f"\nUsing a random initial state:")
print(f"  Random initial state: {random_initial_state}")

plt.figure(figsize=(14, 6))
final_state_spurious, e_history_spurious, state_history_spurious = work(random_initial_state, w_spurious, n=max_iterations)

# Track overlap against whichever stored pattern the initial state is closest to.
initial_overlaps_with_stored = [get_overlap(p, random_initial_state) for p in stored_patterns_spurious]
closest_initial_idx = np.argmax(np.abs(initial_overlaps_with_stored))
target_pattern_for_overlap = stored_patterns_spurious[closest_initial_idx]
overlaps_spurious = [get_overlap(target_pattern_for_overlap, s) for s in state_history_spurious]

print(f"  Overlap with the closest stored pattern {closest_initial_idx+1}: {get_overlap(target_pattern_for_overlap, random_initial_state):.2f}")
print(f"  Final state: {final_state_spurious}")

is_stored_pattern = False
for i, p in enumerate(stored_patterns_spurious):
    if np.array_equal(p, final_state_spurious) or np.array_equal(p, -final_state_spurious):
        print(f"Final state converged to stored pattern {i+1} or its inverse.")
        is_stored_pattern = True
        break
if not is_stored_pattern:
    print("  The final state is not any stored pattern; it is most likely a spurious state.")
    print("  Note that the energy still descends to a local minimum.")

plt.subplot(1, 2, 1)
plt.plot(e_history_spurious, label='Spurious-state energy')
plt.title('Energy evolution (spurious state)')
plt.xlabel('Iteration')
plt.ylabel('Energy (E)')
plt.grid(True)
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(overlaps_spurious, label=f'Overlap with closest stored pattern {closest_initial_idx+1}')
plt.title('Overlap evolution (spurious state)')
plt.xlabel('Iteration')
plt.ylabel('Overlap')
plt.grid(True)
plt.ylim(-1.1, 1.1)
plt.legend()

plt.tight_layout()
plt.show()
print("-" * 40)
```

```
==================================================
--- Experiment 3: Hebbian learning, a negative case (convergence to a spurious state) ---
Stored 3 patterns:
  Pattern 1: [-1  1  1  1  1  1 -1 -1  1  1]
  Pattern 2: [-1 -1  1  1 -1  1  1  1  1  1]
  Pattern 3: [-1  1  1 -1  1  1  1 -1 -1  1]

Using a random initial state:
  Random initial state: [-1  1  1  1 -1 -1 -1  1  1 -1]
  Overlap with the closest stored pattern 3: -0.40
  Final state: [ 1 -1 -1  1 -1 -1 -1  1  1 -1]
Final state converged to stored pattern 3 or its inverse.
```
In Experiment 2 we stored 20 patterns, far beyond the network's theoretical capacity of about 1.38. The overloaded weights interfere with one another, so even though the network started very close to pattern 1 (initial overlap 0.80), it drifted away completely (final overlap 0.00) and ended up at a different stored pattern instead of recalling the intended one.
In Experiment 3 we used a completely randomly generated initial state rather than a noisy copy of a stored pattern. Even though this input resembled none of the stored patterns, the network still converged to a stable state, and that state turned out to be the exact inverse of stored pattern 3. Because Hebbian learning does not distinguish sign, -P is just as stable an attractor as P, so the network can settle into such unintended states, which is what we mean by a spurious state.
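Inverted patterns are not the only spurious states: odd mixtures of stored patterns, such as $\operatorname{sign}(P_1 + P_2 + P_3)$, are also classic spurious attractors. A quick probe using the helpers above (written for this note, not part of the experiments; with only 10 neurons the outcome varies from seed to seed):

```python
# Probe a classic spurious attractor: the majority vote of the three stored patterns.
mixture = np.sign(stored_patterns_spurious[0]
                  + stored_patterns_spurious[1]
                  + stored_patterns_spurious[2]).astype(int)
final_mix, _, _ = work(np.copy(mixture), w_spurious, n=200)
# If the mixture is stable, the final state matches no single stored pattern
# but keeps a partial (~0.5) overlap with each of them.
print([round(get_overlap(p, final_mix), 2) for p in stored_patterns_spurious])
```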