[RFC,v1,0/3] x86 / intel_pstate: Set asymmetric CPU capacity on hybrid systems

Message ID 7663799.EvYhyI6sBW@kreacher

Message

Rafael J. Wysocki April 25, 2024, 7:03 p.m. UTC
Hi Everyone,

The purpose of this series is to provide the scheduler with asymmetric CPU
capacity information on x86 hybrid systems based on Intel hardware.

The asymmetric CPU capacity information is important on hybrid systems
because it allows utilization to be computed for tasks in a consistent
way across all CPUs in the system, regardless of their capacity.  This,
in turn, allows the schedutil cpufreq governor to set CPU performance
levels consistently when tasks migrate between CPUs of different
capacities.  It should also help to improve task placement and load
balancing decisions on hybrid systems, and it is key for anything along
the lines of EAS (Energy Aware Scheduling).
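
To make the consistency argument concrete, here is a toy illustration
(mine, not part of the patches): if a task's running time is scaled by
the capacity of the CPU it runs on, the resulting utilization is
comparable across CPUs, so a task that is 50% busy on a half-capacity
CPU looks the same as one that is 25% busy on a full-capacity CPU.

#include <stdio.h>

#define SCHED_CAPACITY_SCALE	1024UL

/*
 * Toy model of capacity-invariant utilization: scale the fraction of
 * time a task was running by the capacity of the CPU it ran on.
 */
static unsigned long scale_util(unsigned long running, unsigned long period,
				unsigned long cpu_capacity)
{
	return running * cpu_capacity / period;
}

int main(void)
{
	/* 50% busy on a capacity-512 CPU ... */
	printf("%lu\n", scale_util(50, 100, 512));
	/* ... equals 25% busy on a capacity-1024 CPU: both print 256. */
	printf("%lu\n", scale_util(25, 100, SCHED_CAPACITY_SCALE));
	return 0;
}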

The information in question comes from the MSR_HWP_CAPABILITIES register and
is provided to the scheduler by the intel_pstate driver, as per the changelog
of patch [3/3].  Patch [2/3] introduces the arch infrastructure needed for
that (in the form of a per-CPU capacity variable) and patch [1/3] is a
preliminary code adjustment.
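
For reference, the gist of the capacity computation can be sketched as
follows (a simplified illustration of the approach, not the actual
driver code; hwp_highest_perf(), hwp_cpu_capacity() and
max_highest_perf are names made up for this sketch): take the HWP
highest performance level of each CPU and normalize it against the
maximum one found in the system, so that the biggest CPUs end up at
SCHED_CAPACITY_SCALE.

/* Bits 7:0 of MSR_HWP_CAPABILITIES hold the HWP highest performance level. */
static unsigned int hwp_highest_perf(u64 hwp_cap)
{
	return hwp_cap & 0xff;
}

/*
 * Normalize a CPU's highest performance level against the maximum
 * highest performance level across all CPUs (max_highest_perf), so
 * that capacity falls into the [0, SCHED_CAPACITY_SCALE] range.
 */
static unsigned long hwp_cpu_capacity(u64 hwp_cap, unsigned int max_highest_perf)
{
	return hwp_highest_perf(hwp_cap) * SCHED_CAPACITY_SCALE /
			max_highest_perf;
}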

The changes made by patch [2/3] are very simple, but this series is
being sent as an RFC because they increase overhead on non-hybrid as
well as on hybrid systems, which may be regarded as objectionable, even
though the overhead increase is arguably not significant.  The memory
overhead is one unsigned long variable per CPU, which is not a lot in
my view, and the additional memory access at each
arch_scale_cpu_capacity() call site is not expected to be noticeable.
In any case, the extra overhead can be avoided at the cost of making
the code a bit more complex (for example, the additional per-CPU memory
can be allocated dynamically on hybrid systems only and a static branch
can be used for enabling access to it when necessary).  I'm just not
sure if the extra complexity is really worth it, so I'd like to know
the x86 maintainers' take on this.  If you'd prefer the overhead to be
avoided, please let me know.
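
For illustration, the alternative mentioned above could look roughly
like this (a hypothetical sketch, not code from this series;
arch_hybrid_cap_scale_key, arch_cpu_capacity and
arch_enable_hybrid_cap_scale() are made-up names):

static DEFINE_STATIC_KEY_FALSE(arch_hybrid_cap_scale_key);

static unsigned long __percpu *arch_cpu_capacity;

static inline unsigned long arch_scale_cpu_capacity(int cpu)
{
	/* Non-hybrid systems take the patched-out branch: no extra access. */
	if (static_branch_unlikely(&arch_hybrid_cap_scale_key))
		return READ_ONCE(*per_cpu_ptr(arch_cpu_capacity, cpu));

	return SCHED_CAPACITY_SCALE;
}

/* Called on hybrid systems only, before any capacity updates. */
static int arch_enable_hybrid_cap_scale(void)
{
	unsigned int cpu;

	arch_cpu_capacity = alloc_percpu(unsigned long);
	if (!arch_cpu_capacity)
		return -ENOMEM;

	/* Start every CPU at full capacity before the key goes live. */
	for_each_possible_cpu(cpu)
		WRITE_ONCE(*per_cpu_ptr(arch_cpu_capacity, cpu),
			   SCHED_CAPACITY_SCALE);

	static_branch_enable(&arch_hybrid_cap_scale_key);
	return 0;
}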

Of course, any other feedback on the patches is welcome as well.

Thank you!

Comments

Rafael J. Wysocki April 25, 2024, 7:14 p.m. UTC | #1
On Thursday, April 25, 2024 9:04:48 PM CEST Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Add arch_rebuild_sched_domains() for rebuilding scheduling domains and
> updating topology on x86 and make the ITMT code use it.
> 
> First of all, this reduces code duplication somewhat and eliminates
> a need to use an extern variable, but it will also lay the ground for
> future work related to CPU capacity scaling.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This obviously is a duplicate of patch [1/3], sorry about this.  My bad.

I'll send the proper patch [2/3] in a reply to this message.

> ---
>  arch/x86/include/asm/topology.h |    6 ++++--
>  arch/x86/kernel/itmt.c          |   12 ++++--------
>  arch/x86/kernel/smpboot.c       |   10 +++++++++-
>  3 files changed, 17 insertions(+), 11 deletions(-)
> 
> Index: linux-pm/arch/x86/include/asm/topology.h
> ===================================================================
> --- linux-pm.orig/arch/x86/include/asm/topology.h
> +++ linux-pm/arch/x86/include/asm/topology.h
> @@ -235,8 +235,6 @@ struct pci_bus;
>  int x86_pci_root_bus_node(int bus);
>  void x86_pci_root_bus_resources(int bus, struct list_head *resources);
>  
> -extern bool x86_topology_update;
> -
>  #ifdef CONFIG_SCHED_MC_PRIO
>  #include <asm/percpu.h>
>  
> @@ -284,9 +282,13 @@ static inline long arch_scale_freq_capac
>  
>  extern void arch_set_max_freq_ratio(bool turbo_disabled);
>  extern void freq_invariance_set_perf_ratio(u64 ratio, bool turbo_disabled);
> +
> +void arch_rebuild_sched_domains(void);
>  #else
>  static inline void arch_set_max_freq_ratio(bool turbo_disabled) { }
>  static inline void freq_invariance_set_perf_ratio(u64 ratio, bool turbo_disabled) { }
> +
> +static inline void arch_rebuild_sched_domains(void) { }
>  #endif
>  
>  extern void arch_scale_freq_tick(void);
> Index: linux-pm/arch/x86/kernel/itmt.c
> ===================================================================
> --- linux-pm.orig/arch/x86/kernel/itmt.c
> +++ linux-pm/arch/x86/kernel/itmt.c
> @@ -54,10 +54,8 @@ static int sched_itmt_update_handler(str
>  	old_sysctl = sysctl_sched_itmt_enabled;
>  	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>  
> -	if (!ret && write && old_sysctl != sysctl_sched_itmt_enabled) {
> -		x86_topology_update = true;
> -		rebuild_sched_domains();
> -	}
> +	if (!ret && write && old_sysctl != sysctl_sched_itmt_enabled)
> +		arch_rebuild_sched_domains();
>  
>  	mutex_unlock(&itmt_update_mutex);
>  
> @@ -114,8 +112,7 @@ int sched_set_itmt_support(void)
>  
>  	sysctl_sched_itmt_enabled = 1;
>  
> -	x86_topology_update = true;
> -	rebuild_sched_domains();
> +	arch_rebuild_sched_domains();
>  
>  	mutex_unlock(&itmt_update_mutex);
>  
> @@ -150,8 +147,7 @@ void sched_clear_itmt_support(void)
>  	if (sysctl_sched_itmt_enabled) {
>  		/* disable sched_itmt if we are no longer ITMT capable */
>  		sysctl_sched_itmt_enabled = 0;
> -		x86_topology_update = true;
> -		rebuild_sched_domains();
> +		arch_rebuild_sched_domains();
>  	}
>  
>  	mutex_unlock(&itmt_update_mutex);
> Index: linux-pm/arch/x86/kernel/smpboot.c
> ===================================================================
> --- linux-pm.orig/arch/x86/kernel/smpboot.c
> +++ linux-pm/arch/x86/kernel/smpboot.c
> @@ -39,6 +39,7 @@
>  
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
> +#include <linux/cpuset.h>
>  #include <linux/init.h>
>  #include <linux/smp.h>
>  #include <linux/export.h>
> @@ -125,7 +126,7 @@ static DEFINE_PER_CPU_ALIGNED(struct mwa
>  int __read_mostly __max_smt_threads = 1;
>  
>  /* Flag to indicate if a complete sched domain rebuild is required */
> -bool x86_topology_update;
> +static bool x86_topology_update;
>  
>  int arch_update_cpu_topology(void)
>  {
> @@ -135,6 +136,13 @@ int arch_update_cpu_topology(void)
>  	return retval;
>  }
>  
> +#ifdef CONFIG_X86_64
> +void arch_rebuild_sched_domains(void)
> +{
> +	x86_topology_update = true;
> +	rebuild_sched_domains();
> +}
> +#endif
> +
>  static unsigned int smpboot_warm_reset_vector_count;
>  
>  static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip)
>