继续研读JDK的源码,在比较HashMap
和ConcurrentHashMap
的不同之处发现了一个细节——关于Iterator
的实现的不同,其实HashMap
和ConcurrentHashMap
还有更多不同的地方,这也是面试经常问到的问题,有一篇文章我觉得讲的很好了,Java进阶(六)从ConcurrentHashMap的演进看Java多线程核心技术。
Iterator是一种设计模式,在Java Collection Framework
中经常作为容器的视图(view),大多数时候只支持删除、不支持增加,提供统一的接口方法等特点。在Java Collection Framework
的Iterator
实现中大多数是fast-fail
方式的,而支持并发的容器数据结构则没有这个限制。
非并发数据结构的情况
常见的使用方法
1)使用Iterator遍历字符串列表
List<String> lists = Arrays.asList("a","b","c");Iterator<String> iterator = lists.iterator();while (iterator.hasNext()) { String val = iterator.next(); System.out.println(val);}
这种做法是for..each的语法的展开形式
for(String val: lists){ //sout}
2)使用Iterator遍历LinkedList
LinkedList<String> linkedList = new LinkedList<>(lists);iterator = linkedList.iterator();while (iterator.hasNext()) { String val = iterator.next(); System.out.println(val);}
3) 使用Iterator遍历HashMap
Map<String,Integer> hmap = new HashMap<>(3);hmap.put("a",1);hmap.put("b",2);hmap.put("c",3);
Iterator<Map.Entry<String,Integer>> mapIterator = hmap.entrySet().iterator();while (mapIterator.hasNext()) { Map.Entry<String,Integer> entry = mapIterator.next(); System.out.println(entry.getKey() + ":" + entry.getValue());}
非并发数据结构Iterator的实现
1)ArrayList中的Iterator
list中的结构是顺序的,Iterator既然是List的视图,那它也表现了相同的顺序。 ArrayList获得Iterator,
/*** Returns an iterator over the elements in this list in proper sequence.** <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.** @return an iterator over the elements in this list in proper sequence*/public Iterator<E> iterator() { return new Itr();}
源码,
/*** An optimized version of AbstractList.Itr*/private class Itr implements Iterator<E> { int cursor; // index of next element to return int lastRet = -1; // index of last element returned; -1 if no such int expectedModCount = modCount;
public boolean hasNext() { return cursor != size; }
@SuppressWarnings("unchecked") public E next() { checkForComodification(); int i = cursor; if (i >= size) throw new NoSuchElementException(); Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) throw new ConcurrentModificationException(); cursor = i + 1; return (E) elementData[lastRet = i]; }
public void remove() { if (lastRet < 0) throw new IllegalStateException(); checkForComodification();
try { ArrayList.this.remove(lastRet); cursor = lastRet; lastRet = -1; expectedModCount = modCount; } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); } }
final void checkForComodification() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); }}
Itr
是ArrayList
的一个内部类,它能使用宿主类的成员变量,事实上Itr
反映了ArrayList的内部情况,使用了size
、expectedModCount
和elementData
等属性。通过游标cursor的方式不断往前递进,只要游标小于size就说明依然还有元素可以访问。
应该看到的是,在调用了new Iterator()
之后,可以看做Itr
对ArrayList
做了快照,这里的快照并不是很严格,是基于modCount
比较来实现的。它在初始化时备份了modCount
的值,保存为私有的变量expectedModCount
。
首先Iterator
接口并没有诸如add的方法,即不能通过Iterator来为容器增加元素;
其次,如果有其他线程变化了容器的结构(structural modification),那么ArrayList.this.modCount
的值会发生改变,那么在Itr
执行next或者remove方法时会判断出来modCount != expectedModCount
的情况,从而抛出异常fast-fail
。
再次,如果执行了Itr
的remove方法,它能够调用ArrayList.this.remove
的方法,然后修正游标和expectedModCount
等。
ArrayList.this.remove(lastRet);cursor = lastRet;lastRet = -1;expectedModCount = modCount;
2)LinkedList中的Iterator
LinkedList
的Iterator
和ArrayList
中的有一些类似的地方。
首先,LinkedList
的iterator入口方法其实是AbstractSequentialList
抽象类中,
/*** Returns an iterator over the elements in this list (in proper* sequence).<p>** This implementation merely returns a list iterator over the list.** @return an iterator over the elements in this list (in proper sequence)*/public Iterator<E> iterator() { return listIterator();}
/*** Returns a list iterator over the elements in this list (in proper* sequence).** @param index index of first element to be returned from the list* iterator (by a call to the <code>next</code> method)* @return a list iterator over the elements in this list (in proper* sequence)* @throws IndexOutOfBoundsException {@inheritDoc}*/public abstract ListIterator<E> listIterator(int index);
而这个ListIterator
是一个接口,它被LinkedList$ListItr
实现,
private class ListItr implements ListIterator<E> { private Node<E> lastReturned = null; private Node<E> next; private int nextIndex; private int expectedModCount = modCount;
ListItr(int index) { // assert isPositionIndex(index); next = (index == size) ? null : node(index); nextIndex = index; }
public boolean hasNext() { return nextIndex < size; }
public E next() { checkForComodification(); if (!hasNext()) throw new NoSuchElementException();
lastReturned = next; next = next.next; nextIndex++; return lastReturned.item; }
public boolean hasPrevious() { return nextIndex > 0; }
public E previous() { checkForComodification(); if (!hasPrevious()) throw new NoSuchElementException();
lastReturned = next = (next == null) ? last : next.prev; nextIndex--; return lastReturned.item; }
public int nextIndex() { return nextIndex; }
public int previousIndex() { return nextIndex - 1; }
public void remove() { checkForComodification(); if (lastReturned == null) throw new IllegalStateException();
Node<E> lastNext = lastReturned.next; unlink(lastReturned); if (next == lastReturned) next = lastNext; else nextIndex--; lastReturned = null; expectedModCount++; }
public void set(E e) { if (lastReturned == null) throw new IllegalStateException(); checkForComodification(); lastReturned.item = e; }
public void add(E e) { checkForComodification(); lastReturned = null; if (next == null) linkLast(e); else linkBefore(e, next); nextIndex++; expectedModCount++; }
final void checkForComodification() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); }}
LinkedList
的Iterator
要比ArrayList
中的复杂一些,它更支持了add等方法;
类似原来游标的遍历方式,基于size
、expectedModCount
等比较逻辑依然存在,只不过遍历的方式不是原来的下标增进,而是节点之间的next指针来实现。
3)HashMap中的Iterator
HashMap
有多个view视图,keySet
, values
, entrySet
,这里分析下entrySet
这个视图,另外两个原理和entrySet
视图的差不多。
private final class EntrySet extends AbstractSet<Map.Entry<K,V>> { public Iterator<Map.Entry<K,V>> iterator() { return newEntryIterator(); } public boolean contains(Object o) { if (!(o instanceof Map.Entry)) return false; Map.Entry<K,V> e = (Map.Entry<K,V>) o; Entry<K,V> candidate = getEntry(e.getKey()); return candidate != null && candidate.equals(e); } public boolean remove(Object o) { return removeMapping(o) != null; } public int size() { return size; } public void clear() { HashMap.this.clear(); }}
EntrySet的iterator方法中调用了newEntryIterator
,将构造EntryIterator
实例,
EntryIterator
源码
private final class EntryIterator extends HashIterator<Map.Entry<K,V>> { public Map.Entry<K,V> next() { return nextEntry(); }}
EntryIterator
继承了HashIterator
类,复用了父类的大部分方法,只是覆盖了next方法。
HashIterator
源码,
private abstract class HashIterator<E> implements Iterator<E> { Entry<K,V> next; // next entry to return int expectedModCount; // For fast-fail int index; // current slot Entry<K,V> current; // current entry
HashIterator() { expectedModCount = modCount; if (size > 0) { // advance to first entry Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; } }
public final boolean hasNext() { return next != null; }
final Entry<K,V> nextEntry() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); Entry<K,V> e = next; if (e == null) throw new NoSuchElementException();
if ((next = e.next) == null) { Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ; } current = e; return e; }
public void remove() { if (current == null) throw new IllegalStateException(); if (modCount != expectedModCount) throw new ConcurrentModificationException(); Object k = current.key; current = null; HashMap.this.removeEntryForKey(k); expectedModCount = modCount; }}
由于HashMap的结构并不是顺序的,在执行Iterator.next方法时不能通过next指针或下标的方式直接找到下一个元素,HashIterator
为了能达到这个目的,在构造函数和nextEntry
方法中预先做了advance
处理。
//构造函数中if (size > 0) { // advance to first entry Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ;}//nextEntry中if ((next = e.next) == null) { Entry[] t = table; while (index < t.length && (next = t[index++]) == null) ;}
构造函数中预先在HashMap的table数组找到第一个头结点不为null的元素;
(next = t[index++]) == null
的写法有点迷惑性,不考虑HashMap为空的情况,index自增停在next != null
的情况,即 next = t[index-1], index已经往前一步了;
在nextEntry中如果发现e.next是null,此时表示table这个数组元素的链表遍历结束了,需要跳到下一个头节点不为空的元素继续遍历,而index刚好往前一步了,此时继续执行
next = t[index++]
假设next[index]不为空,那么下一个遍历的数组元素头节点找到,并且index已经自增了。
并发数据结构的情况
以ConcurrentHashMap
为例,看ConcurrentHashMap$HashInteraotr
的实现
abstract class HashIterator { int nextSegmentIndex; int nextTableIndex; HashEntry<K,V>[] currentTable; HashEntry<K, V> nextEntry; HashEntry<K, V> lastReturned;
HashIterator() { nextSegmentIndex = segments.length - 1; nextTableIndex = -1; advance(); }
/** * Set nextEntry to first node of next non-empty table * (in backwards order, to simplify checks). */ final void advance() { for (;;) { if (nextTableIndex >= 0) { if ((nextEntry = entryAt(currentTable, nextTableIndex--)) != null) break; } else if (nextSegmentIndex >= 0) { Segment<K,V> seg = segmentAt(segments, nextSegmentIndex--); if (seg != null && (currentTable = seg.table) != null) nextTableIndex = currentTable.length - 1; } else break; } }
final HashEntry<K,V> nextEntry() { HashEntry<K,V> e = nextEntry; if (e == null) throw new NoSuchElementException(); lastReturned = e; // cannot assign until after null check if ((nextEntry = e.next) == null) advance(); return e; }
public final boolean hasNext() { return nextEntry != null; } public final boolean hasMoreElements() { return nextEntry != null; }
public final void remove() { if (lastReturned == null) throw new IllegalStateException(); ConcurrentHashMap.this.remove(lastReturned.key); lastReturned = null; }}
这里能看到ConcurrentHashMap的segment分段因素所在,在构造函数中指定了最后一个segment数组元素,然后做advance处理,也是从后往前处理的。首先找到不为null的分段segment,然后才是在segment的table数组中找到不为null的元素,这都是从后往前“前进”的。
而与HashMap不同的地方,ConcurrentHashMap的Iterator并不是fast-fail
的,它并没有判断modCount;除此之外还应该看到它对nextEntry
的处理,在advance的方法调用以下两个方法,
/*** Gets the jth element of given segment array (if nonnull) with* volatile element access semantics via Unsafe. (The null check* can trigger harmlessly only during deserialization.) Note:* because each element of segments array is set only once (using* fully ordered writes), some performance-sensitive methods rely* on this method only as a recheck upon null reads.*/@SuppressWarnings("unchecked")static final <K,V> Segment<K,V> segmentAt(Segment<K,V>[] ss, int j) { long u = (j << SSHIFT) + SBASE; return ss == null ? null : (Segment<K,V>) UNSAFE.getObjectVolatile(ss, u);}/*** Gets the ith element of given table (if nonnull) with volatile* read semantics. Note: This is manually integrated into a few* performance-sensitive methods to reduce call overhead.*/@SuppressWarnings("unchecked")static final <K,V> HashEntry<K,V> entryAt(HashEntry<K,V>[] tab, int i) { return (tab == null) ? null : (HashEntry<K,V>) UNSAFE.getObjectVolatile (tab, ((long)i << TSHIFT) + TBASE);}
它们都是调用了UNSAFE.getObjectVolatile
方法,利用了volatile access的方式,相较于上锁的方式性能更好。
番外篇
JavaScript实现的Iterator的例子
这个例子来自MDN的文档,做法比较简洁,迭代器
function makeIterator(array){ var nextIndex = 0;
return { next: function(){ return nextIndex < array.length ? {value: array[nextIndex++], done: false} : {done: true}; } };}var it = makeIterator(['yo', 'ya']);console.log(it.next().value); // 'yo'console.log(it.next().value); // 'ya'console.log(it.next().done); // true
可以考虑给这个makeIterator
的返回值加上hasNext
属性,
return { next: ..., hasNext: function() { return nextIndex < array.length; }}
JavaScript利用了闭包实现了Iterator和Java利用内部类实现有相似的地方。
总结
Iterator的主要目的还是为了表现底层数据结构的所有元素,提供一种统一的遍历方式。在不同的数据结构需要针对不同语义做出改动,像LinkedList
的支持add方法,像ConcurrentHashMap
和HashMap
的advance
处理,像ConcurrentHashMap
那样不判断modeCount
而使用volatile access
等。