List of usage examples for java.util.NavigableMap.descendingMap()
NavigableMap<K, V> descendingMap();

Returns a reverse-order view of the mappings contained in this map. The descending map is backed by this map, so changes to the map are reflected in the descending map, and vice-versa.
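Before the project examples, here is a minimal self-contained sketch (class name and sample data invented for illustration) of the two properties that matter in practice: the returned map iterates in reverse order, and it is a live view of the original:

import java.util.NavigableMap;
import java.util.TreeMap;

public class DescendingMapDemo {
    public static void main(String[] args) {
        NavigableMap<Integer, String> map = new TreeMap<>();
        map.put(1, "one");
        map.put(2, "two");

        NavigableMap<Integer, String> desc = map.descendingMap();
        System.out.println(desc);                 // {2=two, 1=one}

        // The view writes through to the backing map...
        desc.put(3, "three");
        System.out.println(map);                  // {1=one, 2=two, 3=three}

        // ...and descending it again yields a view equivalent to the original.
        System.out.println(desc.descendingMap()); // {1=one, 2=two, 3=three}
    }
}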
From source file:Main.java
import java.util.NavigableMap;
import java.util.TreeMap;

public class Main {
    public static void main(String[] args) {
        NavigableMap<Integer, String> map = new TreeMap<Integer, String>();
        map.put(2, "two");
        map.put(1, "one");
        map.put(3, "three");
        // Prints the entries in reverse natural key order: {3=three, 2=two, 1=one}
        System.out.println(map.descendingMap() + "\n");
    }
}
From source file:Main.java
import java.util.Map.Entry;
import java.util.NavigableMap;
import java.util.TreeMap;

public class Main {
    public static void main(String[] args) {
        NavigableMap<String, String> nMap = new TreeMap<>();
        nMap.put("CSS", "style");
        nMap.put("HTML", "mark up");
        nMap.put("Oracle", "database");
        nMap.put("XML", "data");
        System.out.println("Navigable Map:" + nMap);

        // Navigation lookups relative to the key "XML"
        Entry<String, String> lowerXML = nMap.lowerEntry("XML");     // strictly less:     Oracle=database
        Entry<String, String> floorXML = nMap.floorEntry("XML");     // less or equal:     XML=data
        Entry<String, String> higherXML = nMap.higherEntry("XML");   // strictly greater:  null
        Entry<String, String> ceilingXML = nMap.ceilingEntry("XML"); // greater or equal:  XML=data
        System.out.println("Lower:" + lowerXML);
        System.out.println("Floor:" + floorXML);
        System.out.println("Higher:" + higherXML);
        System.out.println("Ceiling:" + ceilingXML);

        // Get the reverse-order view of the map
        NavigableMap<String, String> reverseMap = nMap.descendingMap();
        System.out.println("Navigable Map(Reverse Order):" + reverseMap);
    }
}
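A detail worth noting alongside this example: the navigation methods of the descending view operate relative to the reversed ordering, so "lower" on the view corresponds to "higher" on the original map. A small illustrative sketch (class name invented):

import java.util.NavigableMap;
import java.util.TreeMap;

public class ReverseNavigationDemo {
    public static void main(String[] args) {
        NavigableMap<String, String> nMap = new TreeMap<>();
        nMap.put("CSS", "style");
        nMap.put("HTML", "mark up");
        nMap.put("Oracle", "database");
        nMap.put("XML", "data");

        NavigableMap<String, String> reverse = nMap.descendingMap();
        // "Lower" on the descending view means "greater" in the original order.
        System.out.println(reverse.lowerKey("Oracle")); // XML
        System.out.println(nMap.higherKey("Oracle"));   // XML
        // First/last swap as well.
        System.out.println(reverse.firstKey());         // XML
        System.out.println(nMap.lastKey());             // XML
    }
}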
From source file:com.facebook.buck.util.unarchive.Untar.java
/** Set the modification times on directories that were directly specified in the archive */
private void setDirectoryModificationTimes(ProjectFilesystem filesystem, NavigableMap<Path, Long> dirToTime) {
    // Iterate in descending path order so deeper directories are handled before their parents.
    for (Map.Entry<Path, Long> pathAndTime : dirToTime.descendingMap().entrySet()) {
        File file = filesystem.getRootPath().resolve(pathAndTime.getKey()).toFile();
        file.setLastModified(pathAndTime.getValue());
    }
}
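The descending iteration matters here because, on the default file-system providers, a parent path compares less than its children; walking the map in reverse therefore visits the deepest directories first. A standalone sketch of that ordering (paths are made up):

import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.NavigableMap;
import java.util.TreeMap;

public class DeepestFirstDemo {
    public static void main(String[] args) {
        NavigableMap<Path, Long> dirToTime = new TreeMap<>();
        dirToTime.put(Paths.get("out"), 1000L);
        dirToTime.put(Paths.get("out/classes"), 2000L);
        dirToTime.put(Paths.get("out/classes/util"), 3000L);

        // Prints out/classes/util, out/classes, out: children before parents.
        for (Path dir : dirToTime.descendingMap().keySet()) {
            System.out.println(dir);
        }
    }
}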
From source file:com.wibidata.shopping.servlet.HomePageServlet.java
private List<ProductRating> getRecentRatings(KijiTable userTable, String login) throws IOException {
    KijiTableReader reader = userTable.openTableReader();
    try {
        final long now = System.currentTimeMillis();
        final long lastWeek = now - DateUtils.MILLIS_PER_DAY * 7L;
        EntityId entityId = userTable.getEntityId(login);

        KijiDataRequestBuilder drBuilder = KijiDataRequest.builder();
        drBuilder.newColumnsDef().addFamily("rating");
        drBuilder.withTimeRange(lastWeek, HConstants.LATEST_TIMESTAMP);
        KijiDataRequest dataRequest = drBuilder.build();

        KijiRowData row = reader.get(entityId, dataRequest);
        if (row.containsColumn("rating")) {
            NavigableMap<String, NavigableMap<Long, ProductRating>> ratingsMap = row.getValues("rating");
            // Merge the per-column ratings into a single map keyed by timestamp...
            NavigableMap<Long, ProductRating> sortedRatings = new TreeMap<Long, ProductRating>();
            for (NavigableMap<Long, ProductRating> value : ratingsMap.values()) {
                for (Map.Entry<Long, ProductRating> entry : value.entrySet()) {
                    sortedRatings.put(entry.getKey(), entry.getValue());
                }
            }
            // ...then return them newest-first via the descending view.
            return new ArrayList<ProductRating>(sortedRatings.descendingMap().values());
        }
        return null;
    } catch (KijiDataRequestException e) {
        throw new IOException(e);
    } finally {
        IOUtils.closeQuietly(reader);
    }
}
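The same trick, collecting into a TreeMap keyed by timestamp and then taking descendingMap().values(), yields a newest-first list without an explicit sort or reverse. Sketched here with plain types standing in for the Kiji classes:

import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class NewestFirstDemo {
    public static void main(String[] args) {
        NavigableMap<Long, String> byTimestamp = new TreeMap<>();
        byTimestamp.put(1700000000000L, "oldest rating");
        byTimestamp.put(1700000300000L, "newest rating");
        byTimestamp.put(1700000100000L, "middle rating");

        // descendingMap().values() iterates from the largest timestamp down.
        List<String> newestFirst = new ArrayList<>(byTimestamp.descendingMap().values());
        System.out.println(newestFirst); // [newest rating, middle rating, oldest rating]
    }
}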
From source file:cx.ring.service.LocalService.java
public void readConversation(Conversation conv) {
    for (HistoryEntry h : conv.getRawHistory().values()) {
        NavigableMap<Long, TextMessage> messages = h.getTextMessages();
        // Walk the messages newest-first; stop at the first one already read.
        for (TextMessage msg : messages.descendingMap().values()) {
            if (msg.isRead())
                break;
            readTextMessage(msg);
        }
    }
    notificationManager.cancel(conv.notificationId);
    updateTextNotifications();
}
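Because the map is keyed by timestamp, scanning the descending view and breaking at the first already-read message touches only the unread tail of the conversation. A reduced sketch of that early-exit pattern (a boolean flag stands in for the TextMessage type):

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class UnreadScanDemo {
    public static void main(String[] args) {
        // Timestamp -> "already read?" flag
        NavigableMap<Long, Boolean> messages = new TreeMap<>();
        messages.put(1L, true);
        messages.put(2L, true);
        messages.put(3L, false);
        messages.put(4L, false);

        // Newest-first; stop at the first message that was already read,
        // so only timestamps 4 and 3 are visited.
        for (Map.Entry<Long, Boolean> msg : messages.descendingMap().entrySet()) {
            if (msg.getValue()) break;
            System.out.println("marking " + msg.getKey() + " as read");
        }
    }
}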
From source file:org.apache.hadoop.hbase.master.balancer.DefaultLoadBalancer.java
/**
 * Generate a global load balancing plan according to the specified map of
 * server information to the most loaded regions of each server.
 *
 * The load balancing invariant is that all servers are within 1 region of the
 * average number of regions per server. If the average is an integer number,
 * all servers will be balanced to the average. Otherwise, all servers will
 * have either floor(average) or ceiling(average) regions.
 *
 * HBASE-3609 Modeled regionsToMove using Guava's MinMaxPriorityQueue so that
 * we can fetch from both ends of the queue.
 * At the beginning, we check whether there was an empty region server
 * just discovered by the Master. If so, we alternately choose new / old
 * regions from head / tail of regionsToMove, respectively. This alternation
 * avoids clustering young regions on the newly discovered region server.
 * Otherwise, we choose new regions from the head of regionsToMove.
 *
 * Another improvement from HBASE-3609 is that we assign regions from
 * regionsToMove to underloaded servers in round-robin fashion.
 * Previously one underloaded server would be filled before we moved onto
 * the next underloaded server, leading to clustering of young regions.
 *
 * Finally, we randomly shuffle underloaded servers so that they receive
 * offloaded regions relatively evenly across calls to balanceCluster().
 *
 * The algorithm is currently implemented as such:
 *
 * <ol>
 * <li>Determine the two valid numbers of regions each server should have,
 * <b>MIN</b>=floor(average) and <b>MAX</b>=ceiling(average).
 *
 * <li>Iterate down the most loaded servers, shedding regions from each so
 * each server hosts exactly <b>MAX</b> regions. Stop once you reach a
 * server that already has <= <b>MAX</b> regions.
 * <p>
 * Order the regions to move from most recent to least.
 *
 * <li>Iterate down the least loaded servers, assigning regions so each server
 * has exactly <b>MIN</b> regions. Stop once you reach a server that
 * already has >= <b>MIN</b> regions.
 *
 * Regions being assigned to underloaded servers are those that were shed
 * in the previous step. It is possible that there were not enough
 * regions shed to fill each underloaded server to <b>MIN</b>. If so we
 * end up with the number of regions required to do so, <b>neededRegions</b>.
 *
 * It is also possible that we were able to fill each underloaded server but
 * ended up with regions that were unassigned from overloaded servers and that
 * still do not have assignment.
 *
 * If neither of these conditions hold (no regions needed to fill the
 * underloaded servers, no regions leftover from overloaded servers),
 * we are done and return. Otherwise we handle these cases below.
 *
 * <li>If <b>neededRegions</b> is non-zero (we still have underloaded servers),
 * we iterate the most loaded servers again, shedding a single region from
 * each (this brings them from having <b>MAX</b> regions to having
 * <b>MIN</b> regions).
 *
 * <li>We now definitely have more regions that need assignment, either from
 * the previous step or from the original shedding from overloaded servers.
 * Iterate the least loaded servers filling each to <b>MIN</b>.
 *
 * <li>If we still have more regions that need assignment, again iterate the
 * least loaded servers, this time giving each one (filling them to
 * <b>MAX</b>) until we run out.
 *
 * <li>All servers will now either host <b>MIN</b> or <b>MAX</b> regions.
 *
 * In addition, any server hosting >= <b>MAX</b> regions is guaranteed
 * to end up with <b>MAX</b> regions at the end of the balancing. This
 * ensures the minimal number of regions possible are moved.
 * </ol>
 *
 * TODO: We can at-most reassign the number of regions away from a particular
 * server to be how many they report as most loaded.
 * Should we just keep all assignment in memory? Any objections?
 * Does this mean we need HeapSize on HMaster? Or just careful monitor?
 * (current thinking is we will hold all assignments in memory)
 *
 * @param clusterMap Map of regionservers and their load/region information to
 *   a list of their most loaded regions
 * @return a list of regions to be moved, including source and destination,
 *   or null if cluster is already balanced
 */
public List<RegionPlan> balanceCluster(Map<ServerName, List<HRegionInfo>> clusterMap) {
    boolean emptyRegionServerPresent = false;
    long startTime = System.currentTimeMillis();

    ClusterLoadState cs = new ClusterLoadState(clusterMap);
    if (!this.needsBalance(cs)) return null;

    int numServers = cs.getNumServers();
    NavigableMap<ServerAndLoad, List<HRegionInfo>> serversByLoad = cs.getServersByLoad();
    int numRegions = cs.getNumRegions();
    int min = numRegions / numServers;
    int max = numRegions % numServers == 0 ? min : min + 1;

    // Used to check the balance result.
    StringBuilder strBalanceParam = new StringBuilder();
    strBalanceParam.append("Balance parameter: numRegions=").append(numRegions)
        .append(", numServers=").append(numServers).append(", max=").append(max)
        .append(", min=").append(min);
    LOG.debug(strBalanceParam.toString());

    // Balance the cluster
    // TODO: Look at data block locality or a more complex load to do this
    MinMaxPriorityQueue<RegionPlan> regionsToMove = MinMaxPriorityQueue.orderedBy(rpComparator).create();
    List<RegionPlan> regionsToReturn = new ArrayList<RegionPlan>();

    // Walk down most loaded, pruning each to the max
    int serversOverloaded = 0;
    // flag used to fetch regions from head and tail of list, alternately
    boolean fetchFromTail = false;
    Map<ServerName, BalanceInfo> serverBalanceInfo = new TreeMap<ServerName, BalanceInfo>();
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
        ServerAndLoad sal = server.getKey();
        int regionCount = sal.getLoad();
        if (regionCount <= max) {
            serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(0, 0));
            break;
        }
        serversOverloaded++;
        List<HRegionInfo> regions = server.getValue();
        int numToOffload = Math.min(regionCount - max, regions.size());
        // account for the out-of-band regions which were assigned to this server
        // after some other region server crashed
        Collections.sort(regions, riComparator);
        int numTaken = 0;
        for (int i = 0; i <= numToOffload;) {
            HRegionInfo hri = regions.get(i); // fetch from head
            if (fetchFromTail) {
                hri = regions.get(regions.size() - 1 - i);
            }
            i++;
            // Don't rebalance meta regions.
            if (hri.isMetaRegion()) continue;
            regionsToMove.add(new RegionPlan(hri, sal.getServerName(), null));
            numTaken++;
            if (numTaken >= numToOffload) break;
            // fetch in alternate order if there is a new region server
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
        serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(numToOffload, (-1) * numTaken));
    }
    int totalNumMoved = regionsToMove.size();

    // Walk down least loaded, filling each to the min
    int neededRegions = 0; // number of regions needed to bring all up to min
    fetchFromTail = false;
    Map<ServerName, Integer> underloadedServers = new HashMap<ServerName, Integer>();
    float average = (float) numRegions / numServers; // for logging
    int maxToTake = numRegions - (int) average;
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        if (maxToTake == 0) break; // no more to take
        int regionCount = server.getKey().getLoad();
        if (regionCount >= min && regionCount > 0) {
            continue; // look for other servers which haven't reached min
        }
        int regionsToPut = min - regionCount;
        if (regionsToPut == 0) {
            regionsToPut = 1;
            maxToTake--;
        }
        underloadedServers.put(server.getKey().getServerName(), regionsToPut);
    }
    // number of servers that get new regions
    int serversUnderloaded = underloadedServers.size();
    int incr = 1;
    List<ServerName> sns = Arrays.asList(underloadedServers.keySet().toArray(new ServerName[serversUnderloaded]));
    Collections.shuffle(sns, RANDOM);
    while (regionsToMove.size() > 0) {
        int cnt = 0;
        int i = incr > 0 ? 0 : underloadedServers.size() - 1;
        for (; i >= 0 && i < underloadedServers.size(); i += incr) {
            if (regionsToMove.isEmpty()) break;
            ServerName si = sns.get(i);
            int numToTake = underloadedServers.get(si);
            if (numToTake == 0) continue;
            addRegionPlan(regionsToMove, fetchFromTail, si, regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            underloadedServers.put(si, numToTake - 1);
            cnt++;
            BalanceInfo bi = serverBalanceInfo.get(si);
            if (bi == null) {
                bi = new BalanceInfo(0, 0);
                serverBalanceInfo.put(si, bi);
            }
            bi.setNumRegionsAdded(bi.getNumRegionsAdded() + 1);
        }
        if (cnt == 0) break;
        // iterates underloadedServers in the other direction
        incr = -incr;
    }
    for (Integer i : underloadedServers.values()) {
        // If we still want to take some, increment needed
        neededRegions += i;
    }

    // If none needed to fill all to min and none left to drain all to max,
    // we are done
    if (neededRegions == 0 && regionsToMove.isEmpty()) {
        long endTime = System.currentTimeMillis();
        LOG.info("Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
            + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
            + " less loaded servers");
        return regionsToReturn;
    }

    // Need to do a second pass.
    // Either more regions to assign out or servers that are still underloaded

    // If we need more to fill min, grab one from each most loaded until enough
    if (neededRegions != 0) {
        // Walk down most loaded, grabbing one from each until we get enough
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
            BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
            int idx = balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
            if (idx >= server.getValue().size()) break;
            HRegionInfo region = server.getValue().get(idx);
            if (region.isMetaRegion()) continue; // Don't move meta regions.
            regionsToMove.add(new RegionPlan(region, server.getKey().getServerName(), null));
            totalNumMoved++;
            if (--neededRegions == 0) {
                // No more regions needed, done shedding
                break;
            }
        }
    }

    // Now we have a set of regions that must be all assigned out
    // Assign each underloaded up to the min, then if leftovers, assign to max

    // Walk down least loaded, assigning to each to fill up to min
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if (regionCount >= min) break;
        BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
        if (balanceInfo != null) {
            regionCount += balanceInfo.getNumRegionsAdded();
        }
        if (regionCount >= min) {
            continue;
        }
        int numToTake = min - regionCount;
        int numTaken = 0;
        while (numTaken < numToTake && 0 < regionsToMove.size()) {
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            numTaken++;
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
    }

    // If we still have regions to dish out, assign underloaded to max
    if (0 < regionsToMove.size()) {
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
            int regionCount = server.getKey().getLoad();
            if (regionCount >= max) {
                break;
            }
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            if (regionsToMove.isEmpty()) {
                break;
            }
        }
    }

    long endTime = System.currentTimeMillis();

    if (!regionsToMove.isEmpty() || neededRegions != 0) {
        // Emit data so we can diagnose how the balancer went astray.
        LOG.warn("regionsToMove=" + totalNumMoved + ", numServers=" + numServers + ", serversOverloaded="
            + serversOverloaded + ", serversUnderloaded=" + serversUnderloaded);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<ServerName, List<HRegionInfo>> e : clusterMap.entrySet()) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(e.getKey().toString());
            sb.append(" ");
            sb.append(e.getValue().size());
        }
        LOG.warn("Input " + sb.toString());
    }

    // All done!
    LOG.info("Done. Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
        + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
        + " less loaded servers");
    return regionsToReturn;
}
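The balancer leans on one property of descendingMap(): with servers keyed by load, the descending view enumerates the most loaded server first, so the shedding loop can stop at the first server already at or below the target. A toy version of that walk (ServerAndLoad reduced to a load-to-name map, with distinct loads assumed for simplicity):

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ShedDemo {
    public static void main(String[] args) {
        // Load -> server name; the real key type compares by load, then name.
        NavigableMap<Integer, String> serversByLoad = new TreeMap<>();
        serversByLoad.put(12, "rs1");
        serversByLoad.put(7, "rs2");
        serversByLoad.put(3, "rs3");
        int max = 8; // ceiling(average)

        // Walk most-loaded first; stop at the first server within bounds,
        // mirroring how balanceCluster() uses serversByLoad.descendingMap().
        for (Map.Entry<Integer, String> e : serversByLoad.descendingMap().entrySet()) {
            if (e.getKey() <= max) break;
            System.out.println(e.getValue() + " must shed " + (e.getKey() - max) + " regions");
        }
    }
}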
From source file:com.mirth.connect.server.controllers.DefaultExtensionController.java
@Override
public void initPlugins() {
    // Order all the plugins by their weight before loading any of them.
    Map<String, String> pluginNameMap = new HashMap<String, String>();
    NavigableMap<Integer, List<String>> weightedPlugins = new TreeMap<Integer, List<String>>();
    for (PluginMetaData pmd : getPluginMetaData().values()) {
        if (isExtensionEnabled(pmd.getName())) {
            if (pmd.getServerClasses() != null) {
                for (PluginClass pluginClass : pmd.getServerClasses()) {
                    String clazzName = pluginClass.getName();
                    int weight = pluginClass.getWeight();
                    pluginNameMap.put(clazzName, pmd.getName());

                    List<String> classList = weightedPlugins.get(weight);
                    if (classList == null) {
                        classList = new ArrayList<String>();
                        weightedPlugins.put(weight, classList);
                    }
                    classList.add(clazzName);
                }
            }
        } else {
            logger.warn("Plugin \"" + pmd.getName() + "\" is not enabled.");
        }
    }

    // Load the plugins in descending order of their weight
    for (List<String> classList : weightedPlugins.descendingMap().values()) {
        for (String clazzName : classList) {
            String pluginName = pluginNameMap.get(clazzName);
            try {
                ServerPlugin serverPlugin = (ServerPlugin) Class.forName(clazzName).newInstance();

                if (serverPlugin instanceof ServicePlugin) {
                    ServicePlugin servicePlugin = (ServicePlugin) serverPlugin;
                    /* load any properties that may currently be in the database */
                    Properties currentProperties = getPluginProperties(pluginName);
                    /* get the default properties for the plugin */
                    Properties defaultProperties = servicePlugin.getDefaultProperties();
                    /* if there are any properties that are not currently set, set them to the default */
                    for (Object key : defaultProperties.keySet()) {
                        if (!currentProperties.containsKey(key)) {
                            currentProperties.put(key, defaultProperties.get(key));
                        }
                    }
                    /* save the properties to the database */
                    setPluginProperties(pluginName, currentProperties);
                    /*
                     * initialize the plugin with those properties and add it to the list of
                     * loaded plugins
                     */
                    servicePlugin.init(currentProperties);
                    servicePlugins.put(servicePlugin.getPluginPointName(), servicePlugin);
                    serverPlugins.add(servicePlugin);
                    logger.debug("successfully loaded server plugin: " + serverPlugin.getPluginPointName());
                }

                if (serverPlugin instanceof ChannelPlugin) {
                    ChannelPlugin channelPlugin = (ChannelPlugin) serverPlugin;
                    channelPlugins.put(channelPlugin.getPluginPointName(), channelPlugin);
                    serverPlugins.add(channelPlugin);
                    logger.debug("successfully loaded server channel plugin: " + serverPlugin.getPluginPointName());
                }

                if (serverPlugin instanceof CodeTemplateServerPlugin) {
                    CodeTemplateServerPlugin codeTemplateServerPlugin = (CodeTemplateServerPlugin) serverPlugin;
                    codeTemplateServerPlugins.put(codeTemplateServerPlugin.getPluginPointName(), codeTemplateServerPlugin);
                    serverPlugins.add(codeTemplateServerPlugin);
                    logger.debug("successfully loaded server code template plugin: " + serverPlugin.getPluginPointName());
                }

                if (serverPlugin instanceof DataTypeServerPlugin) {
                    DataTypeServerPlugin dataTypePlugin = (DataTypeServerPlugin) serverPlugin;
                    dataTypePlugins.put(dataTypePlugin.getPluginPointName(), dataTypePlugin);
                    serverPlugins.add(dataTypePlugin);
                    logger.debug("successfully loaded server data type plugin: " + serverPlugin.getPluginPointName());
                }

                if (serverPlugin instanceof ResourcePlugin) {
                    ResourcePlugin resourcePlugin = (ResourcePlugin) serverPlugin;
                    resourcePlugins.put(resourcePlugin.getPluginPointName(), resourcePlugin);
                    serverPlugins.add(resourcePlugin);
                    logger.debug("Successfully loaded resource plugin: " + resourcePlugin.getPluginPointName());
                }

                if (serverPlugin instanceof TransmissionModeProvider) {
                    TransmissionModeProvider transmissionModeProvider = (TransmissionModeProvider) serverPlugin;
                    transmissionModeProviders.put(transmissionModeProvider.getPluginPointName(), transmissionModeProvider);
                    serverPlugins.add(transmissionModeProvider);
                    logger.debug("Successfully loaded transmission mode provider plugin: "
                        + transmissionModeProvider.getPluginPointName());
                }

                if (serverPlugin instanceof AuthorizationPlugin) {
                    AuthorizationPlugin authorizationPlugin = (AuthorizationPlugin) serverPlugin;
                    if (this.authorizationPlugin != null) {
                        throw new Exception("Multiple Authorization Plugins are not permitted.");
                    }
                    this.authorizationPlugin = authorizationPlugin;
                    serverPlugins.add(authorizationPlugin);
                    logger.debug("successfully loaded server authorization plugin: " + serverPlugin.getPluginPointName());
                }
            } catch (Exception e) {
                logger.error("Error instantiating plugin: " + pluginName, e);
            }
        }
    }
}
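The pattern at the top of this method, bucketing items by an integer weight in a TreeMap and then iterating weightedPlugins.descendingMap(), is a compact way to process the highest-weight entries first. A minimal self-contained sketch (class and plugin names here are hypothetical):

import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class WeightedLoadDemo {
    public static void main(String[] args) {
        NavigableMap<Integer, List<String>> weighted = new TreeMap<>();
        // Hypothetical plugin class names bucketed by weight
        add(weighted, 10, "com.example.CorePlugin");
        add(weighted, 1, "com.example.UiPlugin");
        add(weighted, 10, "com.example.AuthPlugin");

        // descendingMap() yields the weight-10 bucket before the weight-1 bucket,
        // so higher-priority plugins load first.
        for (List<String> bucket : weighted.descendingMap().values()) {
            for (String clazz : bucket) {
                System.out.println("loading " + clazz);
            }
        }
    }

    private static void add(NavigableMap<Integer, List<String>> map, int weight, String clazz) {
        map.computeIfAbsent(weight, w -> new ArrayList<>()).add(clazz);
    }
}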
From source file:org.apache.hadoop.hbase.master.balancer.SimpleLoadBalancer.java
/**
 * Generate a global load balancing plan according to the specified map of
 * server information to the most loaded regions of each server.
 *
 * The load balancing invariant is that all servers are within 1 region of the
 * average number of regions per server. If the average is an integer number,
 * all servers will be balanced to the average. Otherwise, all servers will
 * have either floor(average) or ceiling(average) regions.
 *
 * HBASE-3609 Modeled regionsToMove using Guava's MinMaxPriorityQueue so that
 * we can fetch from both ends of the queue.
 * At the beginning, we check whether there was an empty region server
 * just discovered by the Master. If so, we alternately choose new / old
 * regions from head / tail of regionsToMove, respectively. This alternation
 * avoids clustering young regions on the newly discovered region server.
 * Otherwise, we choose new regions from the head of regionsToMove.
 *
 * Another improvement from HBASE-3609 is that we assign regions from
 * regionsToMove to underloaded servers in round-robin fashion.
 * Previously one underloaded server would be filled before we moved onto
 * the next underloaded server, leading to clustering of young regions.
 *
 * Finally, we randomly shuffle underloaded servers so that they receive
 * offloaded regions relatively evenly across calls to balanceCluster().
 *
 * The algorithm is currently implemented as such:
 *
 * <ol>
 * <li>Determine the two valid numbers of regions each server should have,
 * <b>MIN</b>=floor(average) and <b>MAX</b>=ceiling(average).
 *
 * <li>Iterate down the most loaded servers, shedding regions from each so
 * each server hosts exactly <b>MAX</b> regions. Stop once you reach a
 * server that already has <= <b>MAX</b> regions.
 * <p>
 * Order the regions to move from most recent to least.
 *
 * <li>Iterate down the least loaded servers, assigning regions so each server
 * has exactly <b>MIN</b> regions. Stop once you reach a server that
 * already has >= <b>MIN</b> regions.
 *
 * Regions being assigned to underloaded servers are those that were shed
 * in the previous step. It is possible that there were not enough
 * regions shed to fill each underloaded server to <b>MIN</b>. If so we
 * end up with the number of regions required to do so, <b>neededRegions</b>.
 *
 * It is also possible that we were able to fill each underloaded server but
 * ended up with regions that were unassigned from overloaded servers and that
 * still do not have assignment.
 *
 * If neither of these conditions hold (no regions needed to fill the
 * underloaded servers, no regions leftover from overloaded servers),
 * we are done and return. Otherwise we handle these cases below.
 *
 * <li>If <b>neededRegions</b> is non-zero (we still have underloaded servers),
 * we iterate the most loaded servers again, shedding a single region from
 * each (this brings them from having <b>MAX</b> regions to having
 * <b>MIN</b> regions).
 *
 * <li>We now definitely have more regions that need assignment, either from
 * the previous step or from the original shedding from overloaded servers.
 * Iterate the least loaded servers filling each to <b>MIN</b>.
 *
 * <li>If we still have more regions that need assignment, again iterate the
 * least loaded servers, this time giving each one (filling them to
 * <b>MAX</b>) until we run out.
 *
 * <li>All servers will now either host <b>MIN</b> or <b>MAX</b> regions.
 *
 * In addition, any server hosting >= <b>MAX</b> regions is guaranteed
 * to end up with <b>MAX</b> regions at the end of the balancing. This
 * ensures the minimal number of regions possible are moved.
 * </ol>
 *
 * TODO: We can at-most reassign the number of regions away from a particular
 * server to be how many they report as most loaded.
 * Should we just keep all assignment in memory? Any objections?
 * Does this mean we need HeapSize on HMaster? Or just careful monitor?
 * (current thinking is we will hold all assignments in memory)
 *
 * @param clusterMap Map of regionservers and their load/region information to
 *   a list of their most loaded regions
 * @return a list of regions to be moved, including source and destination,
 *   or null if cluster is already balanced
 */
public List<RegionPlan> balanceCluster(Map<ServerName, List<HRegionInfo>> clusterMap) {
    List<RegionPlan> regionsToReturn = balanceMasterRegions(clusterMap);
    if (regionsToReturn != null) {
        return regionsToReturn;
    }
    filterExcludedServers(clusterMap);
    boolean emptyRegionServerPresent = false;
    long startTime = System.currentTimeMillis();

    Collection<ServerName> backupMasters = getBackupMasters();
    ClusterLoadState cs = new ClusterLoadState(masterServerName, backupMasters, backupMasterWeight, clusterMap);
    if (!this.needsBalance(cs)) return null;

    int numServers = cs.getNumServers();
    NavigableMap<ServerAndLoad, List<HRegionInfo>> serversByLoad = cs.getServersByLoad();
    int numRegions = cs.getNumRegions();
    float average = cs.getLoadAverage();
    int max = (int) Math.ceil(average);
    int min = (int) average;

    // Used to check the balance result.
    StringBuilder strBalanceParam = new StringBuilder();
    strBalanceParam.append("Balance parameter: numRegions=").append(numRegions)
        .append(", numServers=").append(numServers).append(", numBackupMasters=")
        .append(cs.getNumBackupMasters()).append(", backupMasterWeight=").append(backupMasterWeight)
        .append(", max=").append(max).append(", min=").append(min);
    LOG.debug(strBalanceParam.toString());

    // Balance the cluster
    // TODO: Look at data block locality or a more complex load to do this
    MinMaxPriorityQueue<RegionPlan> regionsToMove = MinMaxPriorityQueue.orderedBy(rpComparator).create();
    regionsToReturn = new ArrayList<RegionPlan>();

    // Walk down most loaded, pruning each to the max
    int serversOverloaded = 0;
    // flag used to fetch regions from head and tail of list, alternately
    boolean fetchFromTail = false;
    Map<ServerName, BalanceInfo> serverBalanceInfo = new TreeMap<ServerName, BalanceInfo>();
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
        ServerAndLoad sal = server.getKey();
        int load = sal.getLoad();
        if (load <= max) {
            serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(0, 0));
            break;
        }
        serversOverloaded++;
        List<HRegionInfo> regions = server.getValue();
        int w = 1; // Normal region server has weight 1
        if (backupMasters != null && backupMasters.contains(sal.getServerName())) {
            w = backupMasterWeight; // Backup master has heavier weight
        }
        int numToOffload = Math.min((load - max) / w, regions.size());
        // account for the out-of-band regions which were assigned to this server
        // after some other region server crashed
        Collections.sort(regions, riComparator);
        int numTaken = 0;
        for (int i = 0; i <= numToOffload;) {
            HRegionInfo hri = regions.get(i); // fetch from head
            if (fetchFromTail) {
                hri = regions.get(regions.size() - 1 - i);
            }
            i++;
            // Don't rebalance special regions.
            if (shouldBeOnMaster(hri) && masterServerName.equals(sal.getServerName())) continue;
            regionsToMove.add(new RegionPlan(hri, sal.getServerName(), null));
            numTaken++;
            if (numTaken >= numToOffload) break;
            // fetch in alternate order if there is a new region server
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
        serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(numToOffload, (-1) * numTaken));
    }
    int totalNumMoved = regionsToMove.size();

    // Walk down least loaded, filling each to the min
    int neededRegions = 0; // number of regions needed to bring all up to min
    fetchFromTail = false;
    Map<ServerName, Integer> underloadedServers = new HashMap<ServerName, Integer>();
    int maxToTake = numRegions - min;
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        if (maxToTake == 0) break; // no more to take
        int load = server.getKey().getLoad();
        if (load >= min && load > 0) {
            continue; // look for other servers which haven't reached min
        }
        int w = 1; // Normal region server has weight 1
        if (backupMasters != null && backupMasters.contains(server.getKey().getServerName())) {
            w = backupMasterWeight; // Backup master has heavier weight
        }
        int regionsToPut = (min - load) / w;
        if (regionsToPut == 0) {
            regionsToPut = 1;
        }
        maxToTake -= regionsToPut;
        underloadedServers.put(server.getKey().getServerName(), regionsToPut);
    }
    // number of servers that get new regions
    int serversUnderloaded = underloadedServers.size();
    int incr = 1;
    List<ServerName> sns = Arrays.asList(underloadedServers.keySet().toArray(new ServerName[serversUnderloaded]));
    Collections.shuffle(sns, RANDOM);
    while (regionsToMove.size() > 0) {
        int cnt = 0;
        int i = incr > 0 ? 0 : underloadedServers.size() - 1;
        for (; i >= 0 && i < underloadedServers.size(); i += incr) {
            if (regionsToMove.isEmpty()) break;
            ServerName si = sns.get(i);
            int numToTake = underloadedServers.get(si);
            if (numToTake == 0) continue;
            addRegionPlan(regionsToMove, fetchFromTail, si, regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            underloadedServers.put(si, numToTake - 1);
            cnt++;
            BalanceInfo bi = serverBalanceInfo.get(si);
            if (bi == null) {
                bi = new BalanceInfo(0, 0);
                serverBalanceInfo.put(si, bi);
            }
            bi.setNumRegionsAdded(bi.getNumRegionsAdded() + 1);
        }
        if (cnt == 0) break;
        // iterates underloadedServers in the other direction
        incr = -incr;
    }
    for (Integer i : underloadedServers.values()) {
        // If we still want to take some, increment needed
        neededRegions += i;
    }

    // If none needed to fill all to min and none left to drain all to max,
    // we are done
    if (neededRegions == 0 && regionsToMove.isEmpty()) {
        long endTime = System.currentTimeMillis();
        LOG.info("Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
            + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
            + " less loaded servers");
        return regionsToReturn;
    }

    // Need to do a second pass.
    // Either more regions to assign out or servers that are still underloaded

    // If we need more to fill min, grab one from each most loaded until enough
    if (neededRegions != 0) {
        // Walk down most loaded, grabbing one from each until we get enough
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
            BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
            int idx = balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
            if (idx >= server.getValue().size()) break;
            HRegionInfo region = server.getValue().get(idx);
            if (region.isMetaRegion()) continue; // Don't move meta regions.
            regionsToMove.add(new RegionPlan(region, server.getKey().getServerName(), null));
            totalNumMoved++;
            if (--neededRegions == 0) {
                // No more regions needed, done shedding
                break;
            }
        }
    }

    // Now we have a set of regions that must be all assigned out
    // Assign each underloaded up to the min, then if leftovers, assign to max

    // Walk down least loaded, assigning to each to fill up to min
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if (regionCount >= min) break;
        BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
        if (balanceInfo != null) {
            regionCount += balanceInfo.getNumRegionsAdded();
        }
        if (regionCount >= min) {
            continue;
        }
        int numToTake = min - regionCount;
        int numTaken = 0;
        while (numTaken < numToTake && 0 < regionsToMove.size()) {
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            numTaken++;
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
    }

    // If we still have regions to dish out, assign underloaded to max
    if (0 < regionsToMove.size()) {
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
            int regionCount = server.getKey().getLoad();
            BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
            if (balanceInfo != null) {
                regionCount += balanceInfo.getNumRegionsAdded();
            }
            if (regionCount >= max) {
                break;
            }
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            if (regionsToMove.isEmpty()) {
                break;
            }
        }
    }

    long endTime = System.currentTimeMillis();

    if (!regionsToMove.isEmpty() || neededRegions != 0) {
        // Emit data so we can diagnose how the balancer went astray.
        LOG.warn("regionsToMove=" + totalNumMoved + ", numServers=" + numServers + ", serversOverloaded="
            + serversOverloaded + ", serversUnderloaded=" + serversUnderloaded);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<ServerName, List<HRegionInfo>> e : clusterMap.entrySet()) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(e.getKey().toString());
            sb.append(" ");
            sb.append(e.getValue().size());
        }
        LOG.warn("Input " + sb.toString());
    }

    // All done!
    LOG.info("Done. Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
        + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
        + " less loaded servers");
    return regionsToReturn;
}
From source file:com.alibaba.wasp.master.balancer.DefaultLoadBalancer.java
/**
 * Generate a global load balancing plan according to the specified map of
 * server information to the most loaded entityGroups of each server.
 *
 * The load balancing invariant is that all servers are within 1 entityGroup of the
 * average number of entityGroups per server. If the average is an integer number,
 * all servers will be balanced to the average. Otherwise, all servers will
 * have either floor(average) or ceiling(average) entityGroups.
 *
 * HBASE-3609 Modeled entityGroupsToMove using Guava's MinMaxPriorityQueue so that
 * we can fetch from both ends of the queue. At the beginning, we check
 * whether there was an empty entityGroup server just discovered by the Master. If so,
 * we alternately choose new / old entityGroups from head / tail of entityGroupsToMove,
 * respectively. This alternation avoids clustering young entityGroups on the newly
 * discovered entityGroup server. Otherwise, we choose new entityGroups from the head of
 * entityGroupsToMove.
 *
 * Another improvement from HBASE-3609 is that we assign entityGroups from
 * entityGroupsToMove to underloaded servers in round-robin fashion. Previously one
 * underloaded server would be filled before we moved onto the next underloaded
 * server, leading to clustering of young entityGroups.
 *
 * Finally, we randomly shuffle underloaded servers so that they receive
 * offloaded entityGroups relatively evenly across calls to balanceCluster().
 *
 * The algorithm is currently implemented as such:
 *
 * <ol>
 * <li>Determine the two valid numbers of entityGroups each server should have,
 * <b>MIN</b>=floor(average) and <b>MAX</b>=ceiling(average).
 *
 * <li>Iterate down the most loaded servers, shedding entityGroups from each so
 * each server hosts exactly <b>MAX</b> entityGroups. Stop once you reach a server
 * that already has <= <b>MAX</b> entityGroups.
 * <p>
 * Order the entityGroups to move from most recent to least.
 *
 * <li>Iterate down the least loaded servers, assigning entityGroups so each server
 * has exactly <b>MIN</b> entityGroups. Stop once you reach a server that already
 * has >= <b>MIN</b> entityGroups.
 *
 * EntityGroups being assigned to underloaded servers are those that were shed in
 * the previous step. It is possible that there were not enough entityGroups shed
 * to fill each underloaded server to <b>MIN</b>. If so we end up with the
 * number of entityGroups required to do so, <b>neededEntityGroups</b>.
 *
 * It is also possible that we were able to fill each underloaded server but ended up
 * with entityGroups that were unassigned from overloaded servers and that still do
 * not have assignment.
 *
 * If neither of these conditions hold (no entityGroups needed to fill the
 * underloaded servers, no entityGroups leftover from overloaded servers), we are
 * done and return. Otherwise we handle these cases below.
 *
 * <li>If <b>neededEntityGroups</b> is non-zero (we still have underloaded servers),
 * we iterate the most loaded servers again, shedding a single entityGroup from
 * each (this brings them from having <b>MAX</b> entityGroups to having <b>MIN</b>
 * entityGroups).
 *
 * <li>We now definitely have more entityGroups that need assignment, either from
 * the previous step or from the original shedding from overloaded servers.
 * Iterate the least loaded servers filling each to <b>MIN</b>.
 *
 * <li>If we still have more entityGroups that need assignment, again iterate the
 * least loaded servers, this time giving each one (filling them to
 * <b>MAX</b>) until we run out.
 *
 * <li>All servers will now either host <b>MIN</b> or <b>MAX</b> entityGroups.
 *
 * In addition, any server hosting >= <b>MAX</b> entityGroups is guaranteed to
 * end up with <b>MAX</b> entityGroups at the end of the balancing. This ensures
 * the minimal number of entityGroups possible are moved.
 * </ol>
 *
 * TODO: We can at-most reassign the number of entityGroups away from a particular
 * server to be how many they report as most loaded. Should we just keep all
 * assignment in memory? Any objections? Does this mean we need HeapSize on
 * HMaster? Or just careful monitor? (current thinking is we will hold all
 * assignments in memory)
 *
 * @param clusterMap Map of entityGroup servers and their load/entityGroup information
 *   to a list of their most loaded entityGroups
 * @return a list of entityGroups to be moved, including source and destination, or
 *   null if cluster is already balanced
 */
public List<EntityGroupPlan> balanceCluster(Map<ServerName, List<EntityGroupInfo>> clusterMap) {
    boolean emptyFServerPresent = false;
    long startTime = System.currentTimeMillis();

    ClusterLoadState cs = new ClusterLoadState(clusterMap);
    int numServers = cs.getNumServers();
    if (numServers == 0) {
        LOG.debug("numServers=0 so skipping load balancing");
        return null;
    }
    NavigableMap<ServerAndLoad, List<EntityGroupInfo>> serversByLoad = cs.getServersByLoad();
    int numEntityGroups = cs.getNumEntityGroups();

    if (!this.needsBalance(cs)) {
        // Skipped because no server outside (min,max) range
        float average = cs.getLoadAverage(); // for logging
        LOG.info("Skipping load balancing because balanced cluster; " + "servers=" + numServers + " "
            + "entityGroups=" + numEntityGroups + " average=" + average + " " + "mostloaded="
            + serversByLoad.lastKey().getLoad() + " leastloaded=" + serversByLoad.firstKey().getLoad());
        return null;
    }

    int min = numEntityGroups / numServers;
    int max = numEntityGroups % numServers == 0 ? min : min + 1;

    // Used to check the balance result.
    StringBuilder strBalanceParam = new StringBuilder();
    strBalanceParam.append("Balance parameter: numEntityGroups=").append(numEntityGroups)
        .append(", numServers=").append(numServers).append(", max=").append(max)
        .append(", min=").append(min);
    LOG.debug(strBalanceParam.toString());

    // Balance the cluster
    // TODO: Look at data block locality or a more complex load to do this
    MinMaxPriorityQueue<EntityGroupPlan> entityGroupsToMove = MinMaxPriorityQueue.orderedBy(rpComparator).create();
    List<EntityGroupPlan> entityGroupsToReturn = new ArrayList<EntityGroupPlan>();

    // Walk down most loaded, pruning each to the max
    int serversOverloaded = 0;
    // flag used to fetch entityGroups from head and tail of list, alternately
    boolean fetchFromTail = false;
    Map<ServerName, BalanceInfo> serverBalanceInfo = new TreeMap<ServerName, BalanceInfo>();
    for (Map.Entry<ServerAndLoad, List<EntityGroupInfo>> server : serversByLoad.descendingMap().entrySet()) {
        ServerAndLoad sal = server.getKey();
        int entityGroupCount = sal.getLoad();
        if (entityGroupCount <= max) {
            serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(0, 0));
            break;
        }
        serversOverloaded++;
        List<EntityGroupInfo> entityGroups = server.getValue();
        int numToOffload = Math.min(entityGroupCount - max, entityGroups.size());
        // account for the out-of-band entityGroups which were assigned to this server
        // after some other entityGroup server crashed
        Collections.sort(entityGroups, riComparator);
        int numTaken = 0;
        for (int i = 0; i <= numToOffload;) {
            EntityGroupInfo egInfo = entityGroups.get(i); // fetch from head
            if (fetchFromTail) {
                egInfo = entityGroups.get(entityGroups.size() - 1 - i);
            }
            i++;
            entityGroupsToMove.add(new EntityGroupPlan(egInfo, sal.getServerName(), null));
            numTaken++;
            if (numTaken >= numToOffload) break;
            // fetch in alternate order if there is a new entityGroup server
            if (emptyFServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
        serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(numToOffload, (-1) * numTaken));
    }
    int totalNumMoved = entityGroupsToMove.size();

    // Walk down least loaded, filling each to the min
    int neededEntityGroups = 0; // number of entityGroups needed to bring all up to min
    fetchFromTail = false;
    Map<ServerName, Integer> underloadedServers = new HashMap<ServerName, Integer>();
    for (Map.Entry<ServerAndLoad, List<EntityGroupInfo>> server : serversByLoad.entrySet()) {
        int entityGroupCount = server.getKey().getLoad();
        if (entityGroupCount >= min) {
            break;
        }
        underloadedServers.put(server.getKey().getServerName(), min - entityGroupCount);
    }
    // number of servers that get new entityGroups
    int serversUnderloaded = underloadedServers.size();
    int incr = 1;
    List<ServerName> sns = Arrays.asList(underloadedServers.keySet().toArray(new ServerName[serversUnderloaded]));
    Collections.shuffle(sns, RANDOM);
    while (entityGroupsToMove.size() > 0) {
        int cnt = 0;
        int i = incr > 0 ? 0 : underloadedServers.size() - 1;
        for (; i >= 0 && i < underloadedServers.size(); i += incr) {
            if (entityGroupsToMove.isEmpty()) break;
            ServerName si = sns.get(i);
            int numToTake = underloadedServers.get(si);
            if (numToTake == 0) continue;
            addEntityGroupPlan(entityGroupsToMove, fetchFromTail, si, entityGroupsToReturn);
            if (emptyFServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            underloadedServers.put(si, numToTake - 1);
            cnt++;
            BalanceInfo bi = serverBalanceInfo.get(si);
            if (bi == null) {
                bi = new BalanceInfo(0, 0);
                serverBalanceInfo.put(si, bi);
            }
            bi.setNumEntityGroupsAdded(bi.getNumEntityGroupsAdded() + 1);
        }
        if (cnt == 0) break;
        // iterates underloadedServers in the other direction
        incr = -incr;
    }
    for (Integer i : underloadedServers.values()) {
        // If we still want to take some, increment needed
        neededEntityGroups += i;
    }

    // If none needed to fill all to min and none left to drain all to max,
    // we are done
    if (neededEntityGroups == 0 && entityGroupsToMove.isEmpty()) {
        long endTime = System.currentTimeMillis();
        LOG.info("Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
            + " entityGroups off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
            + " less loaded servers");
        return entityGroupsToReturn;
    }

    // Need to do a second pass.
    // Either more entityGroups to assign out or servers that are still underloaded

    // If we need more to fill min, grab one from each most loaded until enough
    if (neededEntityGroups != 0) {
        // Walk down most loaded, grabbing one from each until we get enough
        for (Map.Entry<ServerAndLoad, List<EntityGroupInfo>> server : serversByLoad.descendingMap().entrySet()) {
            BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
            int idx = balanceInfo == null ? 0 : balanceInfo.getNextEntityGroupForUnload();
            if (idx >= server.getValue().size()) break;
            EntityGroupInfo entityGroup = server.getValue().get(idx);
            entityGroupsToMove.add(new EntityGroupPlan(entityGroup, server.getKey().getServerName(), null));
            totalNumMoved++;
            if (--neededEntityGroups == 0) {
                // No more entityGroups needed, done shedding
                break;
            }
        }
    }

    // Now we have a set of entityGroups that must be all assigned out
    // Assign each underloaded up to the min, then if leftovers, assign to max

    // Walk down least loaded, assigning to each to fill up to min
    for (Map.Entry<ServerAndLoad, List<EntityGroupInfo>> server : serversByLoad.entrySet()) {
        int entityGroupCount = server.getKey().getLoad();
        if (entityGroupCount >= min) break;
        BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
        if (balanceInfo != null) {
            entityGroupCount += balanceInfo.getNumEntityGroupsAdded();
        }
        if (entityGroupCount >= min) {
            continue;
        }
        int numToTake = min - entityGroupCount;
        int numTaken = 0;
        while (numTaken < numToTake && 0 < entityGroupsToMove.size()) {
            addEntityGroupPlan(entityGroupsToMove, fetchFromTail, server.getKey().getServerName(),
                entityGroupsToReturn);
            numTaken++;
            if (emptyFServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
    }

    // If we still have entityGroups to dish out, assign underloaded to max
    if (0 < entityGroupsToMove.size()) {
        for (Map.Entry<ServerAndLoad, List<EntityGroupInfo>> server : serversByLoad.entrySet()) {
            int entityGroupCount = server.getKey().getLoad();
            if (entityGroupCount >= max) {
                break;
            }
            addEntityGroupPlan(entityGroupsToMove, fetchFromTail, server.getKey().getServerName(),
                entityGroupsToReturn);
            if (emptyFServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            if (entityGroupsToMove.isEmpty()) {
                break;
            }
        }
    }

    long endTime = System.currentTimeMillis();

    if (!entityGroupsToMove.isEmpty() || neededEntityGroups != 0) {
        // Emit data so we can diagnose how the balancer went astray.
        LOG.warn("entityGroupsToMove=" + totalNumMoved + ", numServers=" + numServers + ", serversOverloaded="
            + serversOverloaded + ", serversUnderloaded=" + serversUnderloaded);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<ServerName, List<EntityGroupInfo>> e : clusterMap.entrySet()) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(e.getKey().toString());
            sb.append(" ");
            sb.append(e.getValue().size());
        }
        LOG.warn("Input " + sb.toString());
    }

    // All done!
    LOG.info("Done. Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
        + " entityGroups off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
        + " less loaded servers");
    return entityGroupsToReturn;
}
From source file:org.apache.hadoop.hbase.master.DefaultLoadBalancer.java
/**
 * Generate a global load balancing plan according to the specified map of
 * server information to the most loaded regions of each server.
 *
 * The load balancing invariant is that all servers are within 1 region of the
 * average number of regions per server. If the average is an integer number,
 * all servers will be balanced to the average. Otherwise, all servers will
 * have either floor(average) or ceiling(average) regions.
 *
 * HBASE-3609 Modeled regionsToMove using Guava's MinMaxPriorityQueue so that
 * we can fetch from both ends of the queue.
 * At the beginning, we check whether there was an empty region server
 * just discovered by the Master. If so, we alternately choose new / old
 * regions from head / tail of regionsToMove, respectively. This alternation
 * avoids clustering young regions on the newly discovered region server.
 * Otherwise, we choose new regions from the head of regionsToMove.
 *
 * Another improvement from HBASE-3609 is that we assign regions from
 * regionsToMove to underloaded servers in round-robin fashion.
 * Previously one underloaded server would be filled before we moved onto
 * the next underloaded server, leading to clustering of young regions.
 *
 * Finally, we randomly shuffle underloaded servers so that they receive
 * offloaded regions relatively evenly across calls to balanceCluster().
 *
 * The algorithm is currently implemented as such:
 *
 * <ol>
 * <li>Determine the two valid numbers of regions each server should have,
 * <b>MIN</b>=floor(average) and <b>MAX</b>=ceiling(average).
 *
 * <li>Iterate down the most loaded servers, shedding regions from each so
 * each server hosts exactly <b>MAX</b> regions. Stop once you reach a
 * server that already has <= <b>MAX</b> regions.
 * <p>
 * Order the regions to move from most recent to least.
 *
 * <li>Iterate down the least loaded servers, assigning regions so each server
 * has exactly <b>MIN</b> regions. Stop once you reach a server that
 * already has >= <b>MIN</b> regions.
 *
 * Regions being assigned to underloaded servers are those that were shed
 * in the previous step. It is possible that there were not enough
 * regions shed to fill each underloaded server to <b>MIN</b>. If so we
 * end up with the number of regions required to do so, <b>neededRegions</b>.
 *
 * It is also possible that we were able to fill each underloaded server but
 * ended up with regions that were unassigned from overloaded servers and that
 * still do not have assignment.
 *
 * If neither of these conditions hold (no regions needed to fill the
 * underloaded servers, no regions leftover from overloaded servers),
 * we are done and return. Otherwise we handle these cases below.
 *
 * <li>If <b>neededRegions</b> is non-zero (we still have underloaded servers),
 * we iterate the most loaded servers again, shedding a single region from
 * each (this brings them from having <b>MAX</b> regions to having
 * <b>MIN</b> regions).
 *
 * <li>We now definitely have more regions that need assignment, either from
 * the previous step or from the original shedding from overloaded servers.
 * Iterate the least loaded servers filling each to <b>MIN</b>.
 *
 * <li>If we still have more regions that need assignment, again iterate the
 * least loaded servers, this time giving each one (filling them to
 * <b>MAX</b>) until we run out.
 *
 * <li>All servers will now either host <b>MIN</b> or <b>MAX</b> regions.
 *
 * In addition, any server hosting >= <b>MAX</b> regions is guaranteed
 * to end up with <b>MAX</b> regions at the end of the balancing. This
 * ensures the minimal number of regions possible are moved.
 * </ol>
 *
 * TODO: We can at-most reassign the number of regions away from a particular
 * server to be how many they report as most loaded.
 * Should we just keep all assignment in memory? Any objections?
 * Does this mean we need HeapSize on HMaster? Or just careful monitor?
 * (current thinking is we will hold all assignments in memory)
 *
 * @param clusterState Map of regionservers and their load/region information to
 *   a list of their most loaded regions
 * @return a list of regions to be moved, including source and destination,
 *   or null if cluster is already balanced
 */
public List<RegionPlan> balanceCluster(Map<ServerName, List<HRegionInfo>> clusterState) {
    boolean emptyRegionServerPresent = false;
    long startTime = System.currentTimeMillis();

    int numServers = clusterState.size();
    if (numServers == 0) {
        LOG.debug("numServers=0 so skipping load balancing");
        return null;
    }
    NavigableMap<ServerAndLoad, List<HRegionInfo>> serversByLoad = new TreeMap<ServerAndLoad, List<HRegionInfo>>();
    int numRegions = 0;
    // Iterate so we can count regions as we build the map
    for (Map.Entry<ServerName, List<HRegionInfo>> server : clusterState.entrySet()) {
        List<HRegionInfo> regions = server.getValue();
        int sz = regions.size();
        if (sz == 0) emptyRegionServerPresent = true;
        numRegions += sz;
        serversByLoad.put(new ServerAndLoad(server.getKey(), sz), regions);
    }

    // Check if we even need to do any load balancing
    float average = (float) numRegions / numServers; // for logging
    // HBASE-3681 check sloppiness first
    int floor = (int) Math.floor(average * (1 - slop));
    int ceiling = (int) Math.ceil(average * (1 + slop));
    if (serversByLoad.lastKey().getLoad() <= ceiling && serversByLoad.firstKey().getLoad() >= floor) {
        // Skipped because no server outside (min,max) range
        LOG.info("Skipping load balancing because balanced cluster; " + "servers=" + numServers + " "
            + "regions=" + numRegions + " average=" + average + " " + "mostloaded="
            + serversByLoad.lastKey().getLoad() + " leastloaded=" + serversByLoad.firstKey().getLoad());
        return null;
    }
    int min = numRegions / numServers;
    int max = numRegions % numServers == 0 ? min : min + 1;

    // Used to check the balance result.
    StringBuilder strBalanceParam = new StringBuilder();
    strBalanceParam.append("Balance parameter: numRegions=").append(numRegions)
        .append(", numServers=").append(numServers).append(", max=").append(max)
        .append(", min=").append(min);
    LOG.debug(strBalanceParam.toString());

    // Balance the cluster
    // TODO: Look at data block locality or a more complex load to do this
    MinMaxPriorityQueue<RegionPlan> regionsToMove = MinMaxPriorityQueue.orderedBy(rpComparator).create();
    List<RegionPlan> regionsToReturn = new ArrayList<RegionPlan>();

    // Walk down most loaded, pruning each to the max
    int serversOverloaded = 0;
    // flag used to fetch regions from head and tail of list, alternately
    boolean fetchFromTail = false;
    Map<ServerName, BalanceInfo> serverBalanceInfo = new TreeMap<ServerName, BalanceInfo>();
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
        ServerAndLoad sal = server.getKey();
        int regionCount = sal.getLoad();
        if (regionCount <= max) {
            serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(0, 0));
            break;
        }
        serversOverloaded++;
        List<HRegionInfo> regions = server.getValue();
        int numToOffload = Math.min(regionCount - max, regions.size());
        // account for the out-of-band regions which were assigned to this server
        // after some other region server crashed
        Collections.sort(regions, riComparator);
        int numTaken = 0;
        for (int i = 0; i <= numToOffload;) {
            HRegionInfo hri = regions.get(i); // fetch from head
            if (fetchFromTail) {
                hri = regions.get(regions.size() - 1 - i);
            }
            i++;
            // Don't rebalance meta regions.
            if (hri.isMetaRegion()) continue;
            regionsToMove.add(new RegionPlan(hri, sal.getServerName(), null));
            numTaken++;
            if (numTaken >= numToOffload) break;
            // fetch in alternate order if there is a new region server
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
        serverBalanceInfo.put(sal.getServerName(), new BalanceInfo(numToOffload, (-1) * numTaken));
    }
    int totalNumMoved = regionsToMove.size();

    // Walk down least loaded, filling each to the min
    int neededRegions = 0; // number of regions needed to bring all up to min
    fetchFromTail = false;
    Map<ServerName, Integer> underloadedServers = new HashMap<ServerName, Integer>();
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if (regionCount >= min) {
            break;
        }
        underloadedServers.put(server.getKey().getServerName(), min - regionCount);
    }
    // number of servers that get new regions
    int serversUnderloaded = underloadedServers.size();
    int incr = 1;
    List<ServerName> sns = Arrays.asList(underloadedServers.keySet().toArray(new ServerName[serversUnderloaded]));
    Collections.shuffle(sns, RANDOM);
    while (regionsToMove.size() > 0) {
        int cnt = 0;
        int i = incr > 0 ? 0 : underloadedServers.size() - 1;
        for (; i >= 0 && i < underloadedServers.size(); i += incr) {
            if (regionsToMove.isEmpty()) break;
            ServerName si = sns.get(i);
            int numToTake = underloadedServers.get(si);
            if (numToTake == 0) continue;
            addRegionPlan(regionsToMove, fetchFromTail, si, regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            underloadedServers.put(si, numToTake - 1);
            cnt++;
            BalanceInfo bi = serverBalanceInfo.get(si);
            if (bi == null) {
                bi = new BalanceInfo(0, 0);
                serverBalanceInfo.put(si, bi);
            }
            bi.setNumRegionsAdded(bi.getNumRegionsAdded() + 1);
        }
        if (cnt == 0) break;
        // iterates underloadedServers in the other direction
        incr = -incr;
    }
    for (Integer i : underloadedServers.values()) {
        // If we still want to take some, increment needed
        neededRegions += i;
    }

    // If none needed to fill all to min and none left to drain all to max,
    // we are done
    if (neededRegions == 0 && regionsToMove.isEmpty()) {
        long endTime = System.currentTimeMillis();
        LOG.info("Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
            + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
            + " less loaded servers");
        return regionsToReturn;
    }

    // Need to do a second pass.
    // Either more regions to assign out or servers that are still underloaded

    // If we need more to fill min, grab one from each most loaded until enough
    if (neededRegions != 0) {
        // Walk down most loaded, grabbing one from each until we get enough
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.descendingMap().entrySet()) {
            BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
            int idx = balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
            if (idx >= server.getValue().size()) break;
            HRegionInfo region = server.getValue().get(idx);
            if (region.isMetaRegion()) continue; // Don't move meta regions.
            regionsToMove.add(new RegionPlan(region, server.getKey().getServerName(), null));
            totalNumMoved++;
            if (--neededRegions == 0) {
                // No more regions needed, done shedding
                break;
            }
        }
    }

    // Now we have a set of regions that must be all assigned out
    // Assign each underloaded up to the min, then if leftovers, assign to max

    // Walk down least loaded, assigning to each to fill up to min
    for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if (regionCount >= min) break;
        BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
        if (balanceInfo != null) {
            regionCount += balanceInfo.getNumRegionsAdded();
        }
        if (regionCount >= min) {
            continue;
        }
        int numToTake = min - regionCount;
        int numTaken = 0;
        while (numTaken < numToTake && 0 < regionsToMove.size()) {
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            numTaken++;
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
        }
    }

    // If we still have regions to dish out, assign underloaded to max
    if (0 < regionsToMove.size()) {
        for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server : serversByLoad.entrySet()) {
            int regionCount = server.getKey().getLoad();
            if (regionCount >= max) {
                break;
            }
            addRegionPlan(regionsToMove, fetchFromTail, server.getKey().getServerName(), regionsToReturn);
            if (emptyRegionServerPresent) {
                fetchFromTail = !fetchFromTail;
            }
            if (regionsToMove.isEmpty()) {
                break;
            }
        }
    }

    long endTime = System.currentTimeMillis();

    if (!regionsToMove.isEmpty() || neededRegions != 0) {
        // Emit data so we can diagnose how the balancer went astray.
        LOG.warn("regionsToMove=" + totalNumMoved + ", numServers=" + numServers + ", serversOverloaded="
            + serversOverloaded + ", serversUnderloaded=" + serversUnderloaded);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<ServerName, List<HRegionInfo>> e : clusterState.entrySet()) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(e.getKey().toString());
            sb.append(" ");
            sb.append(e.getValue().size());
        }
        LOG.warn("Input " + sb.toString());
    }

    // All done!
    LOG.info("Done. Calculated a load balance in " + (endTime - startTime) + "ms. " + "Moving " + totalNumMoved
        + " regions off of " + serversOverloaded + " overloaded servers onto " + serversUnderloaded
        + " less loaded servers");
    return regionsToReturn;
}