From patchwork Mon Nov 20 12:06:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Kuvyrkov X-Patchwork-Id: 745354 Delivered-To: patch@linaro.org Received: by 2002:a5d:5052:0:b0:32d:baff:b0ca with SMTP id h18csp1240432wrt; Mon, 20 Nov 2023 04:08:11 -0800 (PST) X-Google-Smtp-Source: AGHT+IEsJx3MD/evGKe+ZknZnKLAY50FNh0aHqbdO0pEAdLDASRes5K3RV6gc4Zi6VkoVokAnIj/ X-Received: by 2002:a05:620a:1aa7:b0:77b:d0f0:962b with SMTP id bl39-20020a05620a1aa700b0077bd0f0962bmr12175102qkb.23.1700482091167; Mon, 20 Nov 2023 04:08:11 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700482091; cv=pass; d=google.com; s=arc-20160816; b=MeywHnvSfbIkA1YyKUx+fMy3C6cCKo8YAJVMOSXllMZSYTDGClUk9lhcJXGn95WO4h YcQoCNIzINRIrE9UOveOXM3uiBs0MimWj0mtv5UJWPLXS0sAL6k9HHyBLIHh++vlJ8eE HRQnG7O7Y2UvEyqx3k/KDCX4loTKAr0DtJlcJfH5vKywbQSHsyL0fHCQV7cu0/0sp6qM pQoei02kSWYz8aMXKUKBy2VQWY4PXjr4/hRi3HGYk87fWdma4kA27+6lkH3nbYF2cgCD G5URD9j0iPRNJJ41YV27iAwGUczaB89wQ2Y98JFdSG0VUEVRPzI2+bHAQkqTefjGh68E M2rQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=XWSaJJ/f8+1zQVVK8Gk1mrKZPYAA8REpQy73jsEi7HY=; fh=98edGTG9zj/Nk59bsyQAUrc4IlBzGyUrREu3W2WgiHA=; b=YvowO5+y77CMHEBFvg0bOcZWPjLVehbGbqBaypwkEK44G8uWffM9VLAc3BnU0srzKC uZJfiJi5qqkaEVUMcRB55pt0mpwe54IZXdRyGphJ0vkr3H/X93BApgBZ/awmvVpKIkbi 7w24ZLEiPUtdG8LgcxWZv07xK4DF9iBU0iS9ohvq7/Hmje9EI/2zb1D/UPL5Eh2l/J7v XPWlAB55TCNr1iU50zV7KEotvd6vGuUjWn64oYiRMmBZJRGpn2Un5kbqgddAEWKOgcO/ 9j2FukEYRZcEiibv+LYeo7lL24RyKSWqeeprATry/HI+HYriq2yWBR/g9RQc/s2016kC rpIg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="Yu/rTRBj"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+patch=linaro.org@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+patch=linaro.org@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n3-20020a05620a294300b007757beeb5acsi8892685qkp.445.2023.11.20.04.08.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Nov 2023 04:08:11 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+patch=linaro.org@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="Yu/rTRBj"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+patch=linaro.org@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+patch=linaro.org@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C1799386182C for ; Mon, 20 Nov 2023 12:08:10 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lj1-x231.google.com (mail-lj1-x231.google.com [IPv6:2a00:1450:4864:20::231]) by sourceware.org (Postfix) with ESMTPS id CBA0A3860769 for ; Mon, 20 Nov 2023 12:07:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CBA0A3860769 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org CBA0A3860769 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::231 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700482059; cv=none; b=KwrpbkIItA2jErX74HmiKQIaOfAKyJKY5YDBCQ9y4qATTDx2RnOnxRl9jZ1QXFvZMkFZY44eqiL9Y6ooXXtVz8dGHyA4mkBks1x8sHK2o7vohili5uUEParAzUuzkGOQh8R1ufpG3Dkwxuip9qtQf3MFEXjaniMFlCNtRos8sAc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700482059; c=relaxed/simple; bh=OC6mDSE1mT5OARjETGIplcg+LoUqzO7tIr/9EMh2aCY=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=oJdbyvP91fXnGM9IwEHtsZfip8gHDYb9WIdP3MaIaJa/4aBSjwQ6t5F+d1ceZJrT6uXs7x0naZHPYHKwQqQHlU8y7NS1z9NwsV/90/flcSLQV5sPw0Zx2+QjNlY7YrQD/YeWO6nuFqEMluODrGq44rGKh84RKAUQ6uwJOOsrJH4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lj1-x231.google.com with SMTP id 38308e7fff4ca-2c83c0ad7e6so12268311fa.1 for ; Mon, 20 Nov 2023 04:07:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1700482055; x=1701086855; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XWSaJJ/f8+1zQVVK8Gk1mrKZPYAA8REpQy73jsEi7HY=; b=Yu/rTRBjClTh7Pd9fT0Fhcc27dPN1JyBnsriQpxPQRTBCHpb/jduv35ryO/kJza52q 97Pxaf4koxQMGcAkOaoqKnENc04H4jR796mi97bTXlvJ8Kjn5q/OnDvSSTXzdOSm63po 15NFtSaDb9LLERap88Lz0oxlZwUccvvoNj8AvmIYCSzhBRVrUdw+z1H903uNWRHUIMyv GQlMzy2/2Y81KfsMEHU/XBdHekudclpcQBAYBLcmngYGgBMfC/yaRZBx514fGJK1I37s JX4HlXbjKy0HbQ3oP6EPdGVeIkz0nfTkoGJWiuGpZCBMRiBajMQ/CV9uAkqrn6qp0zTb AWkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700482055; x=1701086855; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XWSaJJ/f8+1zQVVK8Gk1mrKZPYAA8REpQy73jsEi7HY=; b=ayFMzFpEI6XFUK7rOtVgmRPNeyqhr22aVh5SJXMYyq5maWAUcr0KhyDVxcYmHcQg2x QXdaKMgSpfrJr5r90AolQIi/VxomwNqMYsk4jg0W5/4M39BGl2SpXV1NjJ2X9rJlm5v5 DqcI6YpoI9gMRIxqEIyfU+xtMJFa+ihTbWF0CsKu6SsAE/KZSbAMackV72LY4GBdN0dQ +jAKWHNNNY80NKNvmRHsmzF9c+sbtYEBYo8G950sM1cA6dtZWs91AbCAE+HwnD0NW2MW q1rhYWHbuKOVAOnH0bk//GacHeTu5GjLXuozXdX4bCb/3MdcPhDOqZL7a34PZSSITdFP LSRw== X-Gm-Message-State: AOJu0YwDhOwjQgO2O8KC8sQ5aYVB6c/UzRzBH0a1vJ2+9H2jAURs4z2d sgFKaiG4bkJBfu++6Qsa9K/PLORQN1vMauDoYKzF X-Received: by 2002:a05:651c:2048:b0:2c5:36e:31bf with SMTP id t8-20020a05651c204800b002c5036e31bfmr3589258ljo.5.1700482055425; Mon, 20 Nov 2023 04:07:35 -0800 (PST) Received: from localhost.localdomain (static.225.72.216.95.clients.your-server.de. [95.216.72.225]) by smtp.gmail.com with ESMTPSA id p22-20020a2ea4d6000000b002c50b040e94sm1045559ljm.85.2023.11.20.04.07.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Nov 2023 04:07:34 -0800 (PST) From: Maxim Kuvyrkov To: gcc-patches@gcc.gnu.org Cc: Maxim Kuvyrkov , Bernd Schmidt , Vladimir Makarov , Jeff Law Subject: [PATCH 1/1] sched-deps.cc (find_modifiable_mems): Avoid exponential behavior Date: Mon, 20 Nov 2023 12:06:49 +0000 Message-Id: <20231120120649.672893-2-maxim.kuvyrkov@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231120120649.672893-1-maxim.kuvyrkov@linaro.org> References: <20231120120649.672893-1-maxim.kuvyrkov@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patch=linaro.org@gcc.gnu.org This patch avoids sched-deps.cc:find_inc() creating exponential number of dependencies, which become memory and compilation time hogs. Consider example (simplified from PR96388) ... === sp=sp-4 // sp_insnA mem_insnA1[sp+A1] ... mem_insnAN[sp+AN] sp=sp-4 // sp_insnB mem_insnB1[sp+B1] ... mem_insnBM[sp+BM] === ... in this example find_modifiable_mems() will arrange for mem_insnA* to be able to pass sp_insnA, and, while doing this, will create dependencies between all mem_insnA*s and sp_insnB -- because sp_insnB is a consumer of sp_insnA. After this sp_insnB will have N new backward dependencies. Then find_modifiable_mems() gets to mem_insnB*s and starts to create N new dependencies for _every_ mem_insnB*. This gets us N*M new dependencies. In PR96833's testcase N and M are 10k-15k, which causes RAM usage of 30GB and compilation time of 30 minutes, with sched2 accounting for 95% of both metrics. After this patch the RAM usage is down to 1GB and compilation time is down to 3-4 minutes, with sched2 no longer standing out on -ftime-report or memory usage. gcc/ChangeLog: PR rtl-optimization/96388 PR rtl-optimization/111554 * sched-deps.cc (find_inc): Avoid exponential behavior. --- gcc/sched-deps.cc | 45 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 41 insertions(+), 4 deletions(-) diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc index c23218890f3..397bb9fd462 100644 --- a/gcc/sched-deps.cc +++ b/gcc/sched-deps.cc @@ -4779,24 +4779,59 @@ parse_add_or_inc (struct mem_inc_info *mii, rtx_insn *insn, bool before_mem) /* Once a suitable mem reference has been found and the corresponding data in MII has been filled in, this function is called to find a suitable add or inc insn involving the register we found in the memory - reference. */ + reference. + If successful, this function will create additional dependencies between + - mii->inc_insn's producers and mii->mem_insn as a consumer (if backwards) + - mii->inc_insn's consumers and mii->mem_insn as a producer (if !backwards). +*/ static bool find_inc (struct mem_inc_info *mii, bool backwards) { sd_iterator_def sd_it; dep_t dep; + sd_list_types_def mem_deps = backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW; + int n_mem_deps = sd_lists_size (mii->mem_insn, mem_deps); - sd_it = sd_iterator_start (mii->mem_insn, - backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW); + sd_it = sd_iterator_start (mii->mem_insn, mem_deps); while (sd_iterator_cond (&sd_it, &dep)) { dep_node_t node = DEP_LINK_NODE (*sd_it.linkp); rtx_insn *pro = DEP_PRO (dep); rtx_insn *con = DEP_CON (dep); - rtx_insn *inc_cand = backwards ? pro : con; + rtx_insn *inc_cand; + int n_inc_deps; + + if (backwards) + { + inc_cand = pro; + n_inc_deps = sd_lists_size (inc_cand, SD_LIST_BACK); + } + else + { + inc_cand = con; + n_inc_deps = sd_lists_size (inc_cand, SD_LIST_FORW); + } + + /* In the FOR_EACH_DEP loop below we will create additional n_inc_deps + for mem_insn. This by itself is not a problem, since each mem_insn + will have only a few inc_insns associated with it. However, if + we consider that a single inc_insn may have a lot of mem_insns, AND, + on top of that, a few other inc_insns associated with it -- + those _other inc_insns_ will get (n_mem_deps * number of MEM insns) + dependencies created for them. This may cause an exponential + growth of memory usage and scheduling time. + See PR96388 for details. + We [heuristically] use n_inc_deps as a proxy for the number of MEM + insns, and drop opportunities for breaking modifiable_mem dependencies + when dependency lists grow beyond reasonable size. */ + if (n_mem_deps * n_inc_deps + >= param_max_pending_list_length * param_max_pending_list_length) + goto next; + if (DEP_NONREG (dep) || DEP_MULTIPLE (dep)) goto next; + if (parse_add_or_inc (mii, inc_cand, backwards)) { struct dep_replacement *desc; @@ -4838,6 +4873,8 @@ find_inc (struct mem_inc_info *mii, bool backwards) desc->insn = mii->mem_insn; move_dep_link (DEP_NODE_BACK (node), INSN_HARD_BACK_DEPS (con), INSN_SPEC_BACK_DEPS (con)); + + gcc_assert (mii->inc_insn == inc_cand); if (backwards) { FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep)