openEuler:24.03:SP1:Everything
kata-containers
Note: the diffs of some files below are truncated because they were too big.
Changes of Revision 9
_service:recompress:tar_scm:kernel.tar.gz/Documentation/ABI/testing/sysfs-devices-platform-kunpeng_hccs
Changed
@@ -79,3 +79,48 @@ indicates a lane. crc_err_cnt: (RO) CRC err count on this port. ============= ==== ============================================= + +What: /sys/devices/platform/HISI04Bx:00/used_types +Date: August 2024 +KernelVersion: 6.12 +Contact: Huisong Li <lihuisong@huawei.com> +Description: + This interface is used to show all HCCS types used on the + platform, like, HCCS-v1, HCCS-v2 and so on. + +What: /sys/devices/platform/HISI04Bx:00/available_inc_dec_lane_types +What: /sys/devices/platform/HISI04Bx:00/dec_lane_of_type +What: /sys/devices/platform/HISI04Bx:00/inc_lane_of_type +Date: August 2024 +KernelVersion: 6.12 +Contact: Huisong Li <lihuisong@huawei.com> +Description: + These interfaces under /sys/devices/platform/HISI04Bx/ are + used to support the low power consumption feature of some + HCCS types by changing the number of lanes used. The interfaces + changing the number of lanes used are 'dec_lane_of_type' and + 'inc_lane_of_type' which require root privileges. These + interfaces aren't exposed if no HCCS type on platform support + this feature. Please note that decreasing lane number is only + allowed if all the specified HCCS ports are not busy. + + The low power consumption interfaces are as follows: + + ============================= ==== ================================ + available_inc_dec_lane_types: (RO) available HCCS types (string) to + increase and decrease the number + of lane used, e.g. HCCS-v2. + dec_lane_of_type: (WO) input HCCS type supported + decreasing lane to decrease the + used lane number of all specified + HCCS type ports on platform to + the minimum. + You can query the 'cur_lane_num' + to get the minimum lane number + after executing successfully. + inc_lane_of_type: (WO) input HCCS type supported + increasing lane to increase the + used lane number of all specified + HCCS type ports on platform to + the full lane state. + ============================= ==== ================================
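The two write-only attributes documented above are driven from user space with ordinary sysfs reads and writes. Below is a minimal user-space sketch, in C, of the flow the ABI text describes: check available_inc_dec_lane_types first, then request the low-power state of all HCCS-v2 ports via dec_lane_of_type (root only, and only when the ports are idle). The HISI04Bx:00 path is taken from the ABI text; the program structure and error handling are illustrative only.

#include <stdio.h>
#include <string.h>

int main(void)
{
	const char *base = "/sys/devices/platform/HISI04Bx:00";
	char path[256], types[64];
	FILE *f;

	/* Which HCCS types support lane adjustment on this platform? */
	snprintf(path, sizeof(path), "%s/available_inc_dec_lane_types", base);
	f = fopen(path, "r");
	if (!f)
		return 1;	/* interface not exposed: feature unsupported */
	if (!fgets(types, sizeof(types), f))
		types[0] = '\0';
	fclose(f);

	if (!strstr(types, "HCCS-v2"))
		return 1;

	/* Drop all HCCS-v2 ports to the minimum lane count. */
	snprintf(path, sizeof(path), "%s/dec_lane_of_type", base);
	f = fopen(path, "w");
	if (!f)
		return 1;	/* typically requires root */
	fprintf(f, "HCCS-v2\n");
	return fclose(f) ? 1 : 0;
}

After a successful write, the per-port 'cur_lane_num' attribute mentioned in the description reflects the reduced lane count; inc_lane_of_type restores the full-lane state the same way.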
_service:recompress:tar_scm:kernel.tar.gz/Documentation/arch/arm64/booting.rst
Changed
@@ -349,6 +349,27 @@ - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01. + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE): + + - If EL3 is present: + + - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11. + + - If the kernel is entered at EL1 and EL2 is present: + + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1. + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1. + + - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. + - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. + - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1. + + - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. + - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. + + - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1. + - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1. + For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64): - If EL3 is present:
_service:recompress:tar_scm:kernel.tar.gz/Documentation/arch/arm64/silicon-errata.rst
Changed
@@ -202,8 +202,9 @@ +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A | -| | ,11} SMMU PMCG | | | +| Hisilicon | Hip{08,09,09A, | #162001900 | N/A | +| | 10,10C,11} | | | +| | SMMU PMCG | | | +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | LINXICORE9100 | #162100125 | HISILICON_ERRATUM_162100125 | +----------------+-----------------+-----------------+-----------------------------+
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/configs/openeuler_defconfig
Changed
@@ -1232,6 +1232,7 @@ CONFIG_NET_XGRESS=y CONFIG_NET_REDIRECT=y CONFIG_SKB_EXTENSIONS=y +# CONFIG_ETH_CAQM is not set # # Networking options @@ -6840,6 +6841,7 @@ CONFIG_ARM_PMU_ACPI=y CONFIG_ARM_SMMU_V3_PMU=m CONFIG_ARM_PMUV3=y +CONFIG_ARM64_BRBE=y # CONFIG_ARM_DSU_PMU is not set CONFIG_QCOM_L2_PMU=y CONFIG_QCOM_L3_PMU=y
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/el2_setup.h
Changed
@@ -155,6 +155,40 @@ .Lskip_set_cptr_\@: .endm +/* + * Enable BRBE to record cycle counts and branch mispredicts. + * + * At any EL, to record cycle counts BRBE requires that both + * BRBCR_EL2.CC=1 and BRBCR_EL1.CC=1. + * + * At any EL, to record branch mispredicts BRBE requires that both + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1. + * + * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to + * BRBCR_EL2, but the {CC,MPRED} bits in the real BRBCR_EL1 register + * still apply. + * + * Set {CC,MPRBED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we + * only need to enable/disable thse in BRBCR_EL1 regardless of whether + * the kernel ends up executing in EL1 or EL2. + */ +.macro __init_el2_brbe + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_\@ + + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED + msr_s SYS_BRBCR_EL2, x0 + + __check_hvhe .Lset_brbe_nvhe_\@, x1 + msr_s SYS_BRBCR_EL12, x0 // VHE + b .Lskip_brbe_\@ + +.Lset_brbe_nvhe_\@: + msr_s SYS_BRBCR_EL1, x0 // NVHE +.Lskip_brbe_\@: +.endm + /* Disable any fine grained traps */ .macro __init_el2_fgt mrs x1, id_aa64mmfr0_el1 @@ -162,16 +196,48 @@ cbz x1, .Lskip_fgt_\@ mov x0, xzr + mov x2, xzr mrs x1, id_aa64dfr0_el1 ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 cmp x1, #3 b.lt .Lset_debug_fgt_\@ + /* Disable PMSNEVFR_EL1 read and write traps */ - orr x0, x0, #(1 << 62) + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK .Lset_debug_fgt_\@: +#ifdef CONFIG_ARM64_BRBE + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_reg_fgt_\@ + + /* + * Disable read traps for the following registers + * + * BRBSRC|BRBTGT|RBINF_EL1 + * BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS_EL1 + */ + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK + + /* + * Disable write traps for the following registers + * + * BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS_EL1 + */ + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK + + /* Disable read and write traps for BRBCR|BRBFCR_EL1 */ + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK + + /* Disable read traps for BRBIDR_EL1 */ + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK + +.Lskip_brbe_reg_fgt_\@: +#endif /* CONFIG_ARM64_BRBE */ msr_s SYS_HDFGRTR_EL2, x0 - msr_s SYS_HDFGWTR_EL2, x0 + msr_s SYS_HDFGWTR_EL2, x2 mov x0, xzr mrs x1, id_aa64pfr1_el1 @@ -194,7 +260,21 @@ .Lset_fgt_\@: msr_s SYS_HFGRTR_EL2, x0 msr_s SYS_HFGWTR_EL2, x0 - msr_s SYS_HFGITR_EL2, xzr + mov x0, xzr +#ifdef CONFIG_ARM64_BRBE + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_insn_fgt_\@ + + /* Disable traps for BRBIALL instruction */ + orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK + + /* Disable traps for BRBINJ instruction */ + orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK + +.Lskip_brbe_insn_fgt_\@: +#endif /* CONFIG_ARM64_BRBE */ + msr_s SYS_HFGITR_EL2, x0 mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4 @@ -229,6 +309,7 @@ __init_el2_nvhe_idregs __init_el2_cptr __init_el2_fgt + __init_el2_brbe .endm #ifndef __KVM_NVHE_HYPERVISOR__
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/kvm_emulate.h
Changed
@@ -616,7 +616,7 @@ val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN); if (!vcpu_has_sve(vcpu) || - (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED)) + (*host_data_ptr(fp_owner) != FP_STATE_GUEST_OWNED)) val |= CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN; if (cpus_have_final_cap(ARM64_SME)) val |= CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN; @@ -624,7 +624,7 @@ val = CPTR_NVHE_EL2_RES1; if (vcpu_has_sve(vcpu) && - (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED)) + (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED)) val |= CPTR_EL2_TZ; if (cpus_have_final_cap(ARM64_SME)) val &= ~CPTR_EL2_TSM;
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/kvm_host.h
Changed
@@ -450,8 +450,43 @@ struct kvm_vcpu *__hyp_running_vcpu; }; +/* + * This structure is instantiated on a per-CPU basis, and contains + * data that is: + * + * - tied to a single physical CPU, and + * - either have a lifetime that does not extend past vcpu_put() + * - or is an invariant for the lifetime of the system + * + * Use host_data_ptr(field) as a way to access a pointer to such a + * field. + */ struct kvm_host_data { struct kvm_cpu_context host_ctxt; + struct user_fpsimd_state *fpsimd_state; /* hyp VA */ + + /* Ownership of the FP regs */ + enum { + FP_STATE_FREE, + FP_STATE_HOST_OWNED, + FP_STATE_GUEST_OWNED, + } fp_owner; + + /* + * host_debug_state contains the host registers which are + * saved and restored during world switches. + */ + struct { + /* {Break,watch}point registers */ + struct kvm_guest_debug_arch regs; + /* Statistical profiling extension */ + u64 pmscr_el1; + /* Self-hosted trace */ + u64 trfcr_el1; + /* Values of trap registers for the host before guest entry. */ + u64 mdcr_el2; + u64 brbcr_el1; + } host_debug_state; }; struct kvm_host_psci_config { @@ -510,19 +545,9 @@ u64 mdcr_el2; u64 cptr_el2; - /* Values of trap registers for the host before guest entry. */ - u64 mdcr_el2_host; - /* Exception Information */ struct kvm_vcpu_fault_info fault; - /* Ownership of the FP regs */ - enum { - FP_STATE_FREE, - FP_STATE_HOST_OWNED, - FP_STATE_GUEST_OWNED, - } fp_state; - /* Configuration flags, set once and for all before the vcpu can run */ u8 cflags; @@ -545,11 +570,10 @@ * We maintain more than a single set of debug registers to support * debugging the guest from the host and to maintain separate host and * guest state during world switches. vcpu_debug_state are the debug - * registers of the vcpu as the guest sees them. host_debug_state are - * the host registers which are saved and restored during - * world switches. external_debug_state contains the debug - * values we want to debug the guest. This is set via the - * KVM_SET_GUEST_DEBUG ioctl. + * registers of the vcpu as the guest sees them. + * + * external_debug_state contains the debug values we want to debug the + * guest. This is set via the KVM_SET_GUEST_DEBUG ioctl. * * debug_ptr points to the set of debug registers that should be loaded * onto the hardware when running the guest. 
@@ -558,18 +582,8 @@ struct kvm_guest_debug_arch vcpu_debug_state; struct kvm_guest_debug_arch external_debug_state; - struct user_fpsimd_state *host_fpsimd_state; /* hyp VA */ struct task_struct *parent_task; - struct { - /* {Break,watch}point registers */ - struct kvm_guest_debug_arch regs; - /* Statistical profiling extension */ - u64 pmscr_el1; - /* Self-hosted trace */ - u64 trfcr_el1; - } host_debug_state; - /* VGIC state */ struct vgic_cpu vgic_cpu; struct arch_timer_cpu timer_cpu; @@ -755,8 +769,8 @@ #define DEBUG_STATE_SAVE_SPE __vcpu_single_flag(iflags, BIT(5)) /* Save TRBE context if active */ #define DEBUG_STATE_SAVE_TRBE __vcpu_single_flag(iflags, BIT(6)) -/* vcpu running in HYP context */ -#define VCPU_HYP_CONTEXT __vcpu_single_flag(iflags, BIT(7)) +/* Save BRBE context if active */ +#define DEBUG_STATE_SAVE_BRBE __vcpu_single_flag(iflags, BIT(7)) /* SVE enabled for host EL0 */ #define HOST_SVE_ENABLED __vcpu_single_flag(sflags, BIT(0)) @@ -1129,6 +1143,32 @@ DECLARE_KVM_HYP_PER_CPU(struct kvm_host_data, kvm_host_data); +/* + * How we access per-CPU host data depends on the where we access it from, + * and the mode we're in: + * + * - VHE and nVHE hypervisor bits use their locally defined instance + * + * - the rest of the kernel use either the VHE or nVHE one, depending on + * the mode we're running in. + * + * Unless we're in protected mode, fully deprivileged, and the nVHE + * per-CPU stuff is exclusively accessible to the protected EL2 code. + * In this case, the EL1 code uses the *VHE* data as its private state + * (which makes sense in a way as there shouldn't be any shared state + * between the host and the hypervisor). + * + * Yes, this is all totally trivial. Shoot me now. + */ +#if defined(__KVM_NVHE_HYPERVISOR__) || defined(__KVM_VHE_HYPERVISOR__) +#define host_data_ptr(f) (&this_cpu_ptr(&kvm_host_data)->f) +#else +#define host_data_ptr(f) \ + (static_branch_unlikely(&kvm_protected_mode_initialized) ? \ + &this_cpu_ptr(&kvm_host_data)->f : \ + &this_cpu_ptr_hyp_sym(kvm_host_data)->f) +#endif + static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt) { /* The host's MPIDR is immutable, so let's set it up at boot time */
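The host_data_ptr() accessor introduced above is what the remaining KVM hunks in this revision convert to: per-physical-CPU state that used to hang off vcpu->arch (fp_state, host_fpsimd_state, mdcr_el2_host, host_debug_state) is now reached through the per-CPU kvm_host_data instance. A condensed sketch of the resulting pattern, mirroring the hyp/include/hyp/switch.h changes below; example_put_host_fp is a hypothetical helper used only for illustration.

static void example_put_host_fp(void)
{
	/* Save the host FP/SIMD registers only if the host still owns them. */
	if (*host_data_ptr(fp_owner) == FP_STATE_HOST_OWNED)
		__fpsimd_save_state(*host_data_ptr(fpsimd_state));

	/* Ownership is tracked per physical CPU, not per vcpu. */
	*host_data_ptr(fp_owner) = FP_STATE_FREE;
}

Per the macro definition above, the same expression resolves to the locally defined per-CPU variable inside VHE/nVHE hypervisor code, and in the rest of the kernel to either the kernel's own copy (protected mode) or the hyp-mapped instance.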
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/sysreg.h
Changed
@@ -197,16 +197,8 @@ #define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0) #define SYS_BRBINF_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 0)) -#define SYS_BRBINFINJ_EL1 sys_reg(2, 1, 9, 1, 0) #define SYS_BRBSRC_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 1)) -#define SYS_BRBSRCINJ_EL1 sys_reg(2, 1, 9, 1, 1) #define SYS_BRBTGT_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 2)) -#define SYS_BRBTGTINJ_EL1 sys_reg(2, 1, 9, 1, 2) -#define SYS_BRBTS_EL1 sys_reg(2, 1, 9, 0, 2) - -#define SYS_BRBCR_EL1 sys_reg(2, 1, 9, 0, 0) -#define SYS_BRBFCR_EL1 sys_reg(2, 1, 9, 0, 1) -#define SYS_BRBIDR0_EL1 sys_reg(2, 1, 9, 2, 0) #define SYS_TRCITECR_EL1 sys_reg(3, 0, 1, 2, 3) #define SYS_TRCACATR(m) sys_reg(2, 1, 2, ((m & 7) << 1), (2 | (m >> 3))) @@ -488,6 +480,7 @@ #define SYS_SCTLR_EL2 sys_reg(3, 4, 1, 0, 0) #define SYS_ACTLR_EL2 sys_reg(3, 4, 1, 0, 1) +#define SYS_SCTLR2_EL2 sys_reg(3, 4, 1, 0, 3) #define SYS_HCR_EL2 sys_reg(3, 4, 1, 1, 0) #define SYS_MDCR_EL2 sys_reg(3, 4, 1, 1, 1) #define SYS_CPTR_EL2 sys_reg(3, 4, 1, 1, 2) @@ -501,6 +494,7 @@ #define SYS_VTCR_EL2 sys_reg(3, 4, 2, 1, 2) #define SYS_TRFCR_EL2 sys_reg(3, 4, 1, 2, 1) +#define SYS_VNCR_EL2 sys_reg(3, 4, 2, 2, 0) #define SYS_HAFGRTR_EL2 sys_reg(3, 4, 3, 1, 6) #define SYS_SPSR_EL2 sys_reg(3, 4, 4, 0, 0) #define SYS_ELR_EL2 sys_reg(3, 4, 4, 0, 1) @@ -573,25 +567,49 @@ #define SYS_CONTEXTIDR_EL2 sys_reg(3, 4, 13, 0, 1) #define SYS_TPIDR_EL2 sys_reg(3, 4, 13, 0, 2) +#define SYS_SCXTNUM_EL2 sys_reg(3, 4, 13, 0, 7) + +#define __AMEV_op2(m) (m & 0x7) +#define __AMEV_CRm(n, m) (n | ((m & 0x8) >> 3)) +#define __SYS__AMEVCNTVOFF0n_EL2(m) sys_reg(3, 4, 13, __AMEV_CRm(0x8, m), __AMEV_op2(m)) +#define SYS_AMEVCNTVOFF0n_EL2(m) __SYS__AMEVCNTVOFF0n_EL2(m) +#define __SYS__AMEVCNTVOFF1n_EL2(m) sys_reg(3, 4, 13, __AMEV_CRm(0xA, m), __AMEV_op2(m)) +#define SYS_AMEVCNTVOFF1n_EL2(m) __SYS__AMEVCNTVOFF1n_EL2(m) #define SYS_CNTVOFF_EL2 sys_reg(3, 4, 14, 0, 3) #define SYS_CNTHCTL_EL2 sys_reg(3, 4, 14, 1, 0) +#define SYS_CNTHP_TVAL_EL2 sys_reg(3, 4, 14, 2, 0) +#define SYS_CNTHP_CTL_EL2 sys_reg(3, 4, 14, 2, 1) +#define SYS_CNTHP_CVAL_EL2 sys_reg(3, 4, 14, 2, 2) +#define SYS_CNTHV_TVAL_EL2 sys_reg(3, 4, 14, 3, 0) +#define SYS_CNTHV_CTL_EL2 sys_reg(3, 4, 14, 3, 1) +#define SYS_CNTHV_CVAL_EL2 sys_reg(3, 4, 14, 3, 2) /* VHE encodings for architectural EL0/1 system registers */ #define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0) +#define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2) +#define SYS_SCTLR2_EL12 sys_reg(3, 5, 1, 0, 3) +#define SYS_ZCR_EL12 sys_reg(3, 5, 1, 2, 0) +#define SYS_TRFCR_EL12 sys_reg(3, 5, 1, 2, 1) +#define SYS_SMCR_EL12 sys_reg(3, 5, 1, 2, 6) #define SYS_TTBR0_EL12 sys_reg(3, 5, 2, 0, 0) #define SYS_TTBR1_EL12 sys_reg(3, 5, 2, 0, 1) #define SYS_TCR_EL12 sys_reg(3, 5, 2, 0, 2) +#define SYS_TCR2_EL12 sys_reg(3, 5, 2, 0, 3) #define SYS_SPSR_EL12 sys_reg(3, 5, 4, 0, 0) #define SYS_ELR_EL12 sys_reg(3, 5, 4, 0, 1) #define SYS_AFSR0_EL12 sys_reg(3, 5, 5, 1, 0) #define SYS_AFSR1_EL12 sys_reg(3, 5, 5, 1, 1) #define SYS_ESR_EL12 sys_reg(3, 5, 5, 2, 0) #define SYS_TFSR_EL12 sys_reg(3, 5, 5, 6, 0) +#define SYS_FAR_EL12 sys_reg(3, 5, 6, 0, 0) +#define SYS_PMSCR_EL12 sys_reg(3, 5, 9, 9, 0) #define SYS_MAIR_EL12 sys_reg(3, 5, 10, 2, 0) #define SYS_AMAIR_EL12 sys_reg(3, 5, 10, 3, 0) #define SYS_MPAM1_EL12 sys_reg(3, 5, 10, 5, 0) #define SYS_VBAR_EL12 sys_reg(3, 5, 12, 0, 0) +#define SYS_CONTEXTIDR_EL12 sys_reg(3, 5, 13, 0, 1) +#define SYS_SCXTNUM_EL12 sys_reg(3, 5, 13, 0, 7) #define SYS_CNTKCTL_EL12 sys_reg(3, 5, 14, 1, 0) #define SYS_CNTP_TVAL_EL02 sys_reg(3, 
5, 14, 2, 0) #define SYS_CNTP_CTL_EL02 sys_reg(3, 5, 14, 2, 1) @@ -754,6 +772,12 @@ #define OP_DVP_RCTX sys_insn(1, 3, 7, 3, 5) #define OP_CPP_RCTX sys_insn(1, 3, 7, 3, 7) +/* + * BRBE Instructions + */ +#define BRB_IALL_INSN __emit_inst(0xd5000000 | OP_BRB_IALL | (0x1f)) +#define BRB_INJ_INSN __emit_inst(0xd5000000 | OP_BRB_INJ | (0x1f)) + /* Common SCTLR_ELx flags. */ #define SCTLR_ELx_ENTP2 (BIT(60)) #define SCTLR_ELx_DSSBS (BIT(44))
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/virtcca_cvm_guest.h
Changed
@@ -18,6 +18,8 @@ extern void virtcca_cvm_tsi_init(void); +extern void swiotlb_unmap_notify(unsigned long paddr, unsigned long size); + #else static inline int set_cvm_memory_encrypted(unsigned long addr, int numpages) @@ -39,5 +41,6 @@ static inline void virtcca_cvm_tsi_init(void) {} +static inline void swiotlb_unmap_notify(unsigned long paddr, unsigned long size) {} #endif /* CONFIG_HISI_VIRTCCA_GUEST */ #endif /* __VIRTCCA_CVM_GUEST_H */
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/include/asm/virtcca_cvm_smc.h
Changed
@@ -68,6 +68,13 @@ */ #define SMC_TSI_DEVICE_CERT SMC_TSI_FID(0x19A) +/* + * arg0: Paddr of rd + * arg1: Paddr of memory to unmap + * arg2: Size of memory to unmap + */ + #define SMC_TSI_SEC_MEM_UNMAP SMC_TSI_FID(0x19C) + static inline unsigned long tsi_get_version(void) { struct arm_smccc_res res;
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kernel/virtcca_cvm_guest.c
Changed
@@ -7,6 +7,7 @@ #include <linux/module.h> #include <linux/sched.h> #include <linux/vmalloc.h> +#include <linux/swiotlb.h> #include <asm/cacheflush.h> #include <asm/set_memory.h> @@ -69,6 +70,7 @@ { return cvm_guest_enable && static_branch_likely(&cvm_tsi_present); } +EXPORT_SYMBOL_GPL(is_virtcca_cvm_world); static int change_page_range_cvm(pte_t *ptep, unsigned long addr, void *data) { @@ -119,3 +121,36 @@ { return __set_memory_encrypted(addr, numpages, false); } + +/* + * struct io_tlb_no_swiotlb_mem - whether use the + * bounce buffer mechanism or not + * @for_alloc: %true if the pool is used for memory allocation. + * Here it is set to %false, to force devices to use direct dma operations. + * + * @force_bounce: %true if swiotlb bouncing is forced. + * Here it is set to %false, to force devices to use direct dma operations. + */ +static struct io_tlb_mem io_tlb_no_swiotlb_mem = { + .for_alloc = false, + .force_bounce = false, +}; + +void enable_swiotlb_for_cvm_dev(struct device *dev, bool enable) +{ + if (!is_virtcca_cvm_world()) + return; + + if (enable) + swiotlb_dev_init(dev); + else + dev->dma_io_tlb_mem = &io_tlb_no_swiotlb_mem; +} +EXPORT_SYMBOL_GPL(enable_swiotlb_for_cvm_dev); + +void swiotlb_unmap_notify(unsigned long paddr, unsigned long size) +{ + struct arm_smccc_res res; + + arm_smccc_1_1_smc(SMC_TSI_SEC_MEM_UNMAP, paddr, size, &res); +}
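enable_swiotlb_for_cvm_dev() added here is consumed later in this same revision: drivers/base/core.c turns bounce buffering off for every device at device_initialize() time, and the GIC ITS driver turns it back on for its cvm_alloc_device. A condensed sketch of that opt-in pattern, with my_dev as a purely hypothetical device and the include taken from the later hunks:

#include <linux/device.h>
#include <linux/virtcca_cvm_domain.h>

static struct device my_dev;

static void my_dev_dma_setup(void)
{
	/* device_initialize() now defaults devices in a cVM to direct DMA ... */
	device_initialize(&my_dev);

	/* ... and users that really need swiotlb bouncing opt back in. */
	enable_swiotlb_for_cvm_dev(&my_dev, true);
}

enable_swiotlb_for_cvm_dev() returns early outside a virtCCA confidential VM, so callers do not need to guard it with is_virtcca_cvm_world() themselves.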
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/arm.c
Changed
@@ -449,12 +449,6 @@ vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO; - /* - * Default value for the FP state, will be overloaded at load - * time if we support FP (pretty likely) - */ - vcpu->arch.fp_state = FP_STATE_FREE; - /* Set up the timer */ kvm_timer_vcpu_init(vcpu); @@ -2078,7 +2072,7 @@ static void cpu_hyp_init_context(void) { - kvm_init_host_cpu_context(&this_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt); + kvm_init_host_cpu_context(host_data_ptr(host_ctxt)); if (!is_kernel_in_hyp_mode()) cpu_init_hyp_mode();
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/debug.c
Changed
@@ -335,10 +335,15 @@ if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) && !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P)) vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_TRBE); + + /* Check if we have BRBE implemented and available at the host */ + if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT)) + vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_BRBE); } void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu) { vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_SPE); vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE); + vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_BRBE); }
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/fpsimd.c
Changed
@@ -49,8 +49,6 @@ if (ret) return ret; - vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd); - /* * We need to keep current's task_struct pinned until its data has been * unshared with the hypervisor to make sure it is not re-used by the @@ -86,7 +84,8 @@ * guest in kvm_arch_vcpu_ctxflush_fp() and override this to * FP_STATE_FREE if the flag set. */ - vcpu->arch.fp_state = FP_STATE_HOST_OWNED; + *host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED; + *host_data_ptr(fpsimd_state) = kern_hyp_va(¤t->thread.uw.fpsimd_state); vcpu_clear_flag(vcpu, HOST_SVE_ENABLED); if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN) @@ -110,7 +109,7 @@ * been saved, this is very unlikely to happen. */ if (read_sysreg_s(SYS_SVCR) & (SVCR_SM_MASK | SVCR_ZA_MASK)) { - vcpu->arch.fp_state = FP_STATE_FREE; + *host_data_ptr(fp_owner) = FP_STATE_FREE; fpsimd_save_and_flush_cpu_state(); } } @@ -126,7 +125,7 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu) { if (test_thread_flag(TIF_FOREIGN_FPSTATE)) - vcpu->arch.fp_state = FP_STATE_FREE; + *host_data_ptr(fp_owner) = FP_STATE_FREE; } /* @@ -142,7 +141,7 @@ WARN_ON_ONCE(!irqs_disabled()); - if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) { + if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) { /* * Currently we do not support SME guests so SVCR is @@ -195,7 +194,7 @@ isb(); } - if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) { + if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) { if (vcpu_has_sve(vcpu)) { __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
Changed
@@ -135,9 +135,9 @@ if (!vcpu_get_flag(vcpu, DEBUG_DIRTY)) return; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); guest_ctxt = &vcpu->arch.ctxt; - host_dbg = &vcpu->arch.host_debug_state.regs; + host_dbg = host_data_ptr(host_debug_state.regs); guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr); __debug_save_state(host_dbg, host_ctxt); @@ -154,9 +154,9 @@ if (!vcpu_get_flag(vcpu, DEBUG_DIRTY)) return; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); guest_ctxt = &vcpu->arch.ctxt; - host_dbg = &vcpu->arch.host_debug_state.regs; + host_dbg = host_data_ptr(host_debug_state.regs); guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr); __debug_save_state(guest_dbg, guest_ctxt);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/include/hyp/switch.h
Changed
@@ -42,7 +42,7 @@ /* Check whether the FP regs are owned by the guest */ static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu) { - return vcpu->arch.fp_state == FP_STATE_GUEST_OWNED; + return *host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED; } /* Save the 32-bit only FPSIMD system register state */ @@ -82,7 +82,7 @@ static inline void __activate_traps_hfgxtr(struct kvm_vcpu *vcpu) { - struct kvm_cpu_context *hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt); u64 r_clr = 0, w_clr = 0, r_set = 0, w_set = 0, tmp; u64 r_val, w_val; @@ -157,7 +157,7 @@ static inline void __deactivate_traps_hfgxtr(struct kvm_vcpu *vcpu) { - struct kvm_cpu_context *hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt); if (!cpus_have_final_cap(ARM64_HAS_FGT)) return; @@ -218,7 +218,7 @@ write_sysreg(0, pmselr_el0); - hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + hctxt = host_data_ptr(host_ctxt); ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0); write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); vcpu_set_flag(vcpu, PMUSERENR_ON_CPU); @@ -228,7 +228,7 @@ !kern_hyp_va(vcpu->kvm)->arch.pfr1_nmi) sysreg_clear_set_s(SYS_HCRX_EL2, 0, HCRX_EL2_TALLINT); - vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2); + *host_data_ptr(host_debug_state.mdcr_el2) = read_sysreg(mdcr_el2); write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); if (cpus_have_final_cap(ARM64_HAS_HCX)) { @@ -251,7 +251,7 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu) { - write_sysreg(vcpu->arch.mdcr_el2_host, mdcr_el2); + write_sysreg(*host_data_ptr(host_debug_state.mdcr_el2), mdcr_el2); if (cpus_have_final_cap(ARM64_HAS_NMI) && !kern_hyp_va(vcpu->kvm)->arch.pfr1_nmi) @@ -261,7 +261,7 @@ if (kvm_arm_support_pmu_v3()) { struct kvm_cpu_context *hctxt; - hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + hctxt = host_data_ptr(host_ctxt); write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0); vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU); } @@ -364,8 +364,8 @@ isb(); /* Write out the host state if it's in the registers */ - if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED) - __fpsimd_save_state(vcpu->arch.host_fpsimd_state); + if (*host_data_ptr(fp_owner) == FP_STATE_HOST_OWNED) + __fpsimd_save_state(*host_data_ptr(fpsimd_state)); /* Restore the guest state */ if (sve_guest) @@ -377,7 +377,7 @@ if (!(read_sysreg(hcr_el2) & HCR_RW)) write_sysreg(__vcpu_sys_reg(vcpu, FPEXC32_EL2), fpexc32_el2); - vcpu->arch.fp_state = FP_STATE_GUEST_OWNED; + *host_data_ptr(fp_owner) = FP_STATE_GUEST_OWNED; return true; }
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
Changed
@@ -54,6 +54,7 @@ } void pkvm_hyp_vm_table_init(void *tbl); +void pkvm_host_fpsimd_state_init(void); int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva, unsigned long pgd_hva);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/debug-sr.c
Changed
@@ -31,8 +31,8 @@ return; /* Yes; save the control register and disable data generation */ - *pmscr_el1 = read_sysreg_s(SYS_PMSCR_EL1); - write_sysreg_s(0, SYS_PMSCR_EL1); + *pmscr_el1 = read_sysreg_el1(SYS_PMSCR); + write_sysreg_el1(0, SYS_PMSCR); isb(); /* Now drain all buffered data to memory */ @@ -48,7 +48,7 @@ isb(); /* Re-enable data generation */ - write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); + write_sysreg_el1(pmscr_el1, SYS_PMSCR); } static void __debug_save_trace(u64 *trfcr_el1) @@ -63,8 +63,8 @@ * Since access to TRFCR_EL1 is trapped, the guest can't * modify the filtering set by the host. */ - *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1); - write_sysreg_s(0, SYS_TRFCR_EL1); + *trfcr_el1 = read_sysreg_el1(SYS_TRFCR); + write_sysreg_el1(0, SYS_TRFCR); isb(); /* Drain the trace buffer to memory */ tsb_csync(); @@ -76,17 +76,46 @@ return; /* Restore trace filter controls */ - write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1); + write_sysreg_el1(trfcr_el1, SYS_TRFCR); +} + +static void __debug_save_brbe(u64 *brbcr_el1) +{ + *brbcr_el1 = 0; + + /* Check if the BRBE is enabled */ + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE))) + return; + + /* + * Prohibit branch record generation while we are in guest. + * Since access to BRBCR_EL1 is trapped, the guest can't + * modify the filtering set by the host. + */ + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR); + write_sysreg_el1(0, SYS_BRBCR); +} + +static void __debug_restore_brbe(u64 brbcr_el1) +{ + if (!brbcr_el1) + return; + + /* Restore BRBE controls */ + write_sysreg_el1(brbcr_el1, SYS_BRBCR); } void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_SPE)) - __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1); + __debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1)); /* Disable and flush Self-Hosted Trace generation */ if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_TRBE)) - __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1); + __debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1)); + /* Disable BRBE branch records */ + if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_BRBE)) + __debug_save_brbe(host_data_ptr(host_debug_state.brbcr_el1)); } void __debug_switch_to_guest(struct kvm_vcpu *vcpu) @@ -97,9 +126,11 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_SPE)) - __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1); + __debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1)); if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_TRBE)) - __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1); + __debug_restore_trace(*host_data_ptr(host_debug_state.trfcr_el1)); + if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_BRBE)) + __debug_restore_brbe(*host_data_ptr(host_debug_state.brbcr_el1)); } void __debug_switch_to_host(struct kvm_vcpu *vcpu)
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/hyp-main.c
Changed
@@ -39,10 +39,8 @@ hyp_vcpu->vcpu.arch.cptr_el2 = host_vcpu->arch.cptr_el2; hyp_vcpu->vcpu.arch.iflags = host_vcpu->arch.iflags; - hyp_vcpu->vcpu.arch.fp_state = host_vcpu->arch.fp_state; hyp_vcpu->vcpu.arch.debug_ptr = kern_hyp_va(host_vcpu->arch.debug_ptr); - hyp_vcpu->vcpu.arch.host_fpsimd_state = host_vcpu->arch.host_fpsimd_state; hyp_vcpu->vcpu.arch.vsesr_el2 = host_vcpu->arch.vsesr_el2; @@ -64,7 +62,6 @@ host_vcpu->arch.fault = hyp_vcpu->vcpu.arch.fault; host_vcpu->arch.iflags = hyp_vcpu->vcpu.arch.iflags; - host_vcpu->arch.fp_state = hyp_vcpu->vcpu.arch.fp_state; host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr; for (i = 0; i < hyp_cpu_if->used_lrs; ++i)
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/pkvm.c
Changed
@@ -243,6 +243,17 @@ vm_table = tbl; } +void pkvm_host_fpsimd_state_init(void) +{ + unsigned long i; + + for (i = 0; i < hyp_nr_cpus; i++) { + struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i); + + host_data->fpsimd_state = &host_data->host_ctxt.fp_regs; + } +} + /* * Return the hyp vm structure corresponding to the handle. */
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/psci-relay.c
Changed
@@ -205,7 +205,7 @@ struct psci_boot_args *boot_args; struct kvm_cpu_context *host_ctxt; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); if (is_cpu_on) boot_args = this_cpu_ptr(&cpu_on_args);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/setup.c
Changed
@@ -257,8 +257,7 @@ void __noreturn __pkvm_init_finalise(void) { - struct kvm_host_data *host_data = this_cpu_ptr(&kvm_host_data); - struct kvm_cpu_context *host_ctxt = &host_data->host_ctxt; + struct kvm_cpu_context *host_ctxt = host_data_ptr(host_ctxt); unsigned long nr_pages, reserved_pages, pfn; int ret; @@ -301,6 +300,7 @@ goto out; pkvm_hyp_vm_table_init(vm_table_base); + pkvm_host_fpsimd_state_init(); out: /* * We tail-called to here from handle___pkvm_init() and will not return,
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/nvhe/switch.c
Changed
@@ -271,7 +271,7 @@ pmr_sync(); } - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; @@ -346,7 +346,7 @@ __sysreg_restore_state_nvhe(host_ctxt); - if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) + if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) __fpsimd_save_fpexc32(vcpu); __debug_switch_to_host(vcpu); @@ -376,7 +376,7 @@ struct kvm_cpu_context *host_ctxt; struct kvm_vcpu *vcpu; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); vcpu = host_ctxt->__hyp_running_vcpu; if (vcpu) {
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/vhe/switch.c
Changed
@@ -183,7 +183,7 @@ * If we were in HYP context on entry, adjust the PSTATE view * so that the usual helpers work correctly. */ - if (unlikely(vcpu_get_flag(vcpu, VCPU_HYP_CONTEXT))) { + if (vcpu_has_nv(vcpu) && (read_sysreg(hcr_el2) & HCR_NV)) { u64 mode = *vcpu_cpsr(vcpu) & (PSR_MODE_MASK | PSR_MODE32_BIT); switch (mode) { @@ -207,7 +207,7 @@ struct kvm_cpu_context *guest_ctxt; u64 exit_code; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; @@ -232,11 +232,6 @@ sysreg_restore_guest_state_vhe(guest_ctxt); __debug_switch_to_guest(vcpu); - if (is_hyp_ctxt(vcpu)) - vcpu_set_flag(vcpu, VCPU_HYP_CONTEXT); - else - vcpu_clear_flag(vcpu, VCPU_HYP_CONTEXT); - do { /* Jump in the fire! */ exit_code = __guest_enter(vcpu); @@ -250,7 +245,7 @@ sysreg_restore_host_state_vhe(host_ctxt); - if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) + if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) __fpsimd_save_fpexc32(vcpu); __debug_switch_to_host(vcpu); @@ -298,7 +293,7 @@ struct kvm_cpu_context *host_ctxt; struct kvm_vcpu *vcpu; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); vcpu = host_ctxt->__hyp_running_vcpu; __deactivate_traps(vcpu);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
Changed
@@ -67,7 +67,7 @@ struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt; struct kvm_cpu_context *host_ctxt; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); __sysreg_save_user_state(host_ctxt); /* @@ -113,7 +113,7 @@ struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt; struct kvm_cpu_context *host_ctxt; - host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + host_ctxt = host_data_ptr(host_ctxt); deactivate_traps_vhe_put(vcpu); __sysreg_save_el1_state(guest_ctxt);
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/pmu.c
Changed
@@ -232,7 +232,7 @@ if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU)) return false; - hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; + hctxt = host_data_ptr(host_ctxt); ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val; return true; }
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/sys_regs.c
Changed
@@ -1129,6 +1129,11 @@ return true; } +#define BRB_INF_SRC_TGT_EL1(n) \ + { SYS_DESC(SYS_BRBINF_EL1(n)), undef_access }, \ + { SYS_DESC(SYS_BRBSRC_EL1(n)), undef_access }, \ + { SYS_DESC(SYS_BRBTGT_EL1(n)), undef_access } \ + /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */ #define DBG_BCR_BVR_WCR_WVR_EL1(n) \ { SYS_DESC(SYS_DBGBVRn_EL1(n)), \ @@ -1517,6 +1522,9 @@ /* Hide SPE from guests */ val &= ~ID_AA64DFR0_EL1_PMSVer_MASK; + /* Hide BRBE from guests */ + val &= ~ID_AA64DFR0_EL1_BRBE_MASK; + return val; } @@ -1979,6 +1987,8 @@ { SYS_DESC(SYS_DC_CISW), access_dcsw }, { SYS_DESC(SYS_DC_CIGSW), access_dcgsw }, { SYS_DESC(SYS_DC_CIGDSW), access_dcgsw }, + { SYS_DESC(OP_BRB_IALL), undef_access }, + { SYS_DESC(OP_BRB_INJ), undef_access }, DBG_BCR_BVR_WCR_WVR_EL1(0), DBG_BCR_BVR_WCR_WVR_EL1(1), @@ -2009,6 +2019,52 @@ { SYS_DESC(SYS_DBGCLAIMCLR_EL1), trap_raz_wi }, { SYS_DESC(SYS_DBGAUTHSTATUS_EL1), trap_dbgauthstatus_el1 }, + /* + * BRBE branch record sysreg address space is interleaved between + * corresponding BRBINF<N>_EL1, BRBSRC<N>_EL1, and BRBTGT<N>_EL1. + */ + BRB_INF_SRC_TGT_EL1(0), + BRB_INF_SRC_TGT_EL1(16), + BRB_INF_SRC_TGT_EL1(1), + BRB_INF_SRC_TGT_EL1(17), + BRB_INF_SRC_TGT_EL1(2), + BRB_INF_SRC_TGT_EL1(18), + BRB_INF_SRC_TGT_EL1(3), + BRB_INF_SRC_TGT_EL1(19), + BRB_INF_SRC_TGT_EL1(4), + BRB_INF_SRC_TGT_EL1(20), + BRB_INF_SRC_TGT_EL1(5), + BRB_INF_SRC_TGT_EL1(21), + BRB_INF_SRC_TGT_EL1(6), + BRB_INF_SRC_TGT_EL1(22), + BRB_INF_SRC_TGT_EL1(7), + BRB_INF_SRC_TGT_EL1(23), + BRB_INF_SRC_TGT_EL1(8), + BRB_INF_SRC_TGT_EL1(24), + BRB_INF_SRC_TGT_EL1(9), + BRB_INF_SRC_TGT_EL1(25), + BRB_INF_SRC_TGT_EL1(10), + BRB_INF_SRC_TGT_EL1(26), + BRB_INF_SRC_TGT_EL1(11), + BRB_INF_SRC_TGT_EL1(27), + BRB_INF_SRC_TGT_EL1(12), + BRB_INF_SRC_TGT_EL1(28), + BRB_INF_SRC_TGT_EL1(13), + BRB_INF_SRC_TGT_EL1(29), + BRB_INF_SRC_TGT_EL1(14), + BRB_INF_SRC_TGT_EL1(30), + BRB_INF_SRC_TGT_EL1(15), + BRB_INF_SRC_TGT_EL1(31), + + /* Remaining BRBE sysreg addresses space */ + { SYS_DESC(SYS_BRBCR_EL1), undef_access }, + { SYS_DESC(SYS_BRBFCR_EL1), undef_access }, + { SYS_DESC(SYS_BRBTS_EL1), undef_access }, + { SYS_DESC(SYS_BRBINFINJ_EL1), undef_access }, + { SYS_DESC(SYS_BRBSRCINJ_EL1), undef_access }, + { SYS_DESC(SYS_BRBTGTINJ_EL1), undef_access }, + { SYS_DESC(SYS_BRBIDR0_EL1), undef_access }, + { SYS_DESC(SYS_MDCCSR_EL0), trap_raz_wi }, { SYS_DESC(SYS_DBGDTR_EL0), trap_raz_wi }, // DBGDTRTRX_EL0 share the same encoding
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/kvm/virtcca_cvm.c
Changed
@@ -493,7 +493,7 @@ params->kae_vf_num = cfg->kae_vf_num; memcpy(params->sec_addr, cfg->sec_addr, cfg->kae_vf_num * sizeof(u64)); - memcpy(params->hpre_addr, cfg->sec_addr, cfg->kae_vf_num * sizeof(u64)); + memcpy(params->hpre_addr, cfg->hpre_addr, cfg->kae_vf_num * sizeof(u64)); return 0; }
_service:recompress:tar_scm:kernel.tar.gz/arch/arm64/tools/sysreg
Changed
@@ -1002,6 +1002,137 @@ EndEnum EndSysreg + +SysregFields BRBINFx_EL1 +Res0 63:47 +Field 46 CCU +Field 45:32 CC +Res0 31:18 +Field 17 LASTFAILED +Field 16 T +Res0 15:14 +Enum 13:8 TYPE + 0b000000 DIRECT_UNCOND + 0b000001 INDIRECT + 0b000010 DIRECT_LINK + 0b000011 INDIRECT_LINK + 0b000101 RET + 0b000111 ERET + 0b001000 DIRECT_COND + 0b100001 DEBUG_HALT + 0b100010 CALL + 0b100011 TRAP + 0b100100 SERROR + 0b100110 INSN_DEBUG + 0b100111 DATA_DEBUG + 0b101010 ALIGN_FAULT + 0b101011 INSN_FAULT + 0b101100 DATA_FAULT + 0b101110 IRQ + 0b101111 FIQ + 0b110000 IMPDEF_TRAP_EL3 + 0b111001 DEBUG_EXIT +EndEnum +Enum 7:6 EL + 0b00 EL0 + 0b01 EL1 + 0b10 EL2 + 0b11 EL3 +EndEnum +Field 5 MPRED +Res0 4:2 +Enum 1:0 VALID + 0b00 NONE + 0b01 TARGET + 0b10 SOURCE + 0b11 FULL +EndEnum +EndSysregFields + +SysregFields BRBCR_ELx +Res0 63:24 +Field 23 EXCEPTION +Field 22 ERTN +Res0 21:10 +Field 9 FZPSS +Field 8 FZP +Res0 7 +Enum 6:5 TS + 0b01 VIRTUAL + 0b10 GUEST_PHYSICAL + 0b11 PHYSICAL +EndEnum +Field 4 MPRED +Field 3 CC +Res0 2 +Field 1 ExBRE +Field 0 E0BRE +EndSysregFields + +Sysreg BRBCR_EL1 2 1 9 0 0 +Fields BRBCR_ELx +EndSysreg + +Sysreg BRBFCR_EL1 2 1 9 0 1 +Res0 63:30 +Enum 29:28 BANK + 0b00 BANK_0 + 0b01 BANK_1 +EndEnum +Res0 27:23 +Field 22 CONDDIR +Field 21 DIRCALL +Field 20 INDCALL +Field 19 RTN +Field 18 INDIRECT +Field 17 DIRECT +Field 16 EnI +Res0 15:8 +Field 7 PAUSED +Field 6 LASTFAILED +Res0 5:0 +EndSysreg + +Sysreg BRBTS_EL1 2 1 9 0 2 +Field 63:0 TS +EndSysreg + +Sysreg BRBINFINJ_EL1 2 1 9 1 0 +Fields BRBINFx_EL1 +EndSysreg + +Sysreg BRBSRCINJ_EL1 2 1 9 1 1 +Field 63:0 ADDRESS +EndSysreg + +Sysreg BRBTGTINJ_EL1 2 1 9 1 2 +Field 63:0 ADDRESS +EndSysreg + +Sysreg BRBIDR0_EL1 2 1 9 2 0 +Res0 63:16 +Enum 15:12 CC + 0b0101 20_BIT +EndEnum +Enum 11:8 FORMAT + 0b0000 FORMAT_0 +EndEnum +Enum 7:0 NUMREC + 0b00001000 8 + 0b00010000 16 + 0b00100000 32 + 0b01000000 64 +EndEnum +EndSysreg + +Sysreg BRBCR_EL2 2 4 9 0 0 +Fields BRBCR_ELx +EndSysreg + +Sysreg BRBCR_EL12 2 5 9 0 0 +Fields BRBCR_ELx +EndSysreg + Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4 Res0 63:60 UnsignedEnum 59:56 F64MM
_service:recompress:tar_scm:kernel.tar.gz/arch/loongarch/kvm/timer.c
Changed
@@ -161,10 +161,11 @@ if (kvm_vcpu_is_blocking(vcpu)) { /* - * HRTIMER_MODE_PINNED is suggested since vcpu may run in - * the same physical cpu in next time + * HRTIMER_MODE_PINNED_HARD is suggested since vcpu may run in + * the same physical cpu in next time, and the timer should run + * in hardirq context even in the PREEMPT_RT case. */ - hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED); + hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED_HARD); } }
_service:recompress:tar_scm:kernel.tar.gz/arch/loongarch/kvm/vcpu.c
Changed
@@ -1283,7 +1283,7 @@ vcpu->arch.vpid = 0; - hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED); + hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD); vcpu->arch.swtimer.function = kvm_swtimer_wakeup; vcpu->arch.handle_exit = kvm_handle_exit;
_service:recompress:tar_scm:kernel.tar.gz/arch/x86/configs/openeuler_defconfig
Changed
@@ -1219,6 +1219,7 @@ CONFIG_NET_XGRESS=y CONFIG_NET_REDIRECT=y CONFIG_SKB_EXTENSIONS=y +# CONFIG_ETH_CAQM is not set # # Networking options
_service:recompress:tar_scm:kernel.tar.gz/drivers/acpi/arm64/iort.c
Changed
@@ -1712,6 +1712,8 @@ /* HiSilicon Hip09 Platform */ {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP09A ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, /* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */ {"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
_service:recompress:tar_scm:kernel.tar.gz/drivers/base/core.c
Changed
@@ -31,6 +31,7 @@ #include <linux/swiotlb.h> #include <linux/sysfs.h> #include <linux/dma-map-ops.h> /* for dma_default_coherent */ +#include <linux/virtcca_cvm_domain.h> #include "base.h" #include "physical_location.h" @@ -3134,6 +3135,7 @@ dev->dma_coherent = dma_default_coherent; #endif swiotlb_dev_init(dev); + enable_swiotlb_for_cvm_dev(dev, false); } EXPORT_SYMBOL_GPL(device_initialize);
_service:recompress:tar_scm:kernel.tar.gz/drivers/char/tpm/tpm-chip.c
Changed
@@ -519,10 +519,6 @@ { struct tpm_chip *chip = container_of(rng, struct tpm_chip, hwrng); - /* Give back zero bytes, as TPM chip has not yet fully resumed: */ - if (chip->flags & TPM_CHIP_FLAG_SUSPENDED) - return 0; - return tpm_get_random(chip, data, max); }
_service:recompress:tar_scm:kernel.tar.gz/drivers/char/tpm/tpm-interface.c
Changed
@@ -394,6 +394,13 @@ if (!chip) return -ENODEV; + rc = tpm_try_get_ops(chip); + if (rc) { + /* Can be safely set out of locks, as no action cannot race: */ + chip->flags |= TPM_CHIP_FLAG_SUSPENDED; + goto out; + } + if (chip->flags & TPM_CHIP_FLAG_ALWAYS_POWERED) goto suspended; @@ -401,19 +408,18 @@ !pm_suspend_via_firmware()) goto suspended; - rc = tpm_try_get_ops(chip); - if (!rc) { - if (chip->flags & TPM_CHIP_FLAG_TPM2) - tpm2_shutdown(chip, TPM2_SU_STATE); - else - rc = tpm1_pm_suspend(chip, tpm_suspend_pcr); - - tpm_put_ops(chip); + if (chip->flags & TPM_CHIP_FLAG_TPM2) { + tpm2_shutdown(chip, TPM2_SU_STATE); + goto suspended; } + rc = tpm1_pm_suspend(chip, tpm_suspend_pcr); + suspended: chip->flags |= TPM_CHIP_FLAG_SUSPENDED; + tpm_put_ops(chip); +out: if (rc) dev_err(dev, "Ignoring error %d while suspending\n", rc); return 0; @@ -462,11 +468,18 @@ if (!chip) return -ENODEV; + /* Give back zero bytes, as TPM chip has not yet fully resumed: */ + if (chip->flags & TPM_CHIP_FLAG_SUSPENDED) { + rc = 0; + goto out; + } + if (chip->flags & TPM_CHIP_FLAG_TPM2) rc = tpm2_get_random(chip, out, max); else rc = tpm1_get_random(chip, out, max); +out: tpm_put_ops(chip); return rc; }
_service:recompress:tar_scm:kernel.tar.gz/drivers/hwmon/k10temp.c
Changed
@@ -159,8 +159,9 @@ static void read_tempreg_nb_zen(struct pci_dev *pdev, u32 *regval) { - amd_smn_read(amd_pci_dev_to_node_id(pdev), - ZEN_REPORTED_TEMP_CTRL_BASE, regval); + if (amd_smn_read(amd_pci_dev_to_node_id(pdev), + ZEN_REPORTED_TEMP_CTRL_BASE, regval)) + *regval = 0; } static long get_raw_temp(struct k10temp_data *data) @@ -228,6 +229,7 @@ long *val) { struct k10temp_data *data = dev_get_drvdata(dev); + int ret = -EOPNOTSUPP; u32 regval; switch (attr) { @@ -246,14 +248,18 @@ case 2 ... 13: /* Tccd{1-12} */ if (hygon_f18h_m4h()) hygon_read_temp(data, channel, ®val); - else - amd_smn_read(amd_pci_dev_to_node_id(data->pdev), - ZEN_CCD_TEMP(data->ccd_offset, channel - 2), - ®val); + else { + ret = amd_smn_read(amd_pci_dev_to_node_id(data->pdev), + ZEN_CCD_TEMP(data->ccd_offset, channel - 2), + ®val); + + if (ret) + return ret; + } *val = (regval & ZEN_CCD_TEMP_MASK) * 125 - 49000; break; default: - return -EOPNOTSUPP; + return ret; } break; case hwmon_temp_max: @@ -269,7 +275,7 @@ - ((regval >> 24) & 0xf)) * 500 + 52000; break; default: - return -EOPNOTSUPP; + return ret; } return 0; } @@ -407,8 +413,20 @@ int i; for (i = 0; i < limit; i++) { - amd_smn_read(amd_pci_dev_to_node_id(pdev), - ZEN_CCD_TEMP(data->ccd_offset, i), ®val); + /* + * Ignore inaccessible CCDs. + * + * Some systems will return a register value of 0, and the TEMP_VALID + * bit check below will naturally fail. + * + * Other systems will return a PCI_ERROR_RESPONSE (0xFFFFFFFF) for + * the register value. And this will incorrectly pass the TEMP_VALID + * bit check. + */ + if (amd_smn_read(amd_pci_dev_to_node_id(pdev), + ZEN_CCD_TEMP(data->ccd_offset, i), ®val)) + continue; + if (regval & ZEN_CCD_TEMP_VALID) data->show_temp |= BIT(TCCD_BIT(i)); }
_service:recompress:tar_scm:kernel.tar.gz/drivers/irqchip/irq-gic-v3-its.c
Changed
@@ -32,6 +32,7 @@ #ifdef CONFIG_HISI_VIRTCCA_GUEST #include <linux/swiotlb.h> #include <asm/virtcca_cvm_guest.h> +#include <linux/virtcca_cvm_domain.h> #endif #include <linux/irqchip.h> @@ -1106,8 +1107,8 @@ its_encode_valid(cmd, desc->its_vmapp_cmd.valid); if (!desc->its_vmapp_cmd.valid) { + alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count); if (is_v4_1(its)) { - alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count); its_encode_alloc(cmd, alloc); /* * Unmapping a VPE is self-synchronizing on GICv4.1, @@ -1128,13 +1129,13 @@ its_encode_vpt_addr(cmd, vpt_addr); its_encode_vpt_size(cmd, LPI_NRBITS - 1); + alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count); + if (!is_v4_1(its)) goto out; vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page)); - alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count); - its_encode_alloc(cmd, alloc); /* @@ -4252,6 +4253,23 @@ int from, cpu; /* + * Check if we're racing against a VPE being destroyed, for + * which we don't want to allow a VMOVP. + */ + if (!atomic_read(&vpe->vmapp_count)) { + if (gic_requires_eager_mapping()) + return -EINVAL; + + /* + * If we lazily map the VPEs, this isn't an error and + * we can exit cleanly. + */ + cpu = cpumask_first(mask_val); + irq_data_update_effective_affinity(d, cpumask_of(cpu)); + return IRQ_SET_MASK_OK_DONE; + } + + /* * Changing affinity is mega expensive, so let's be as lazy as * we can and only do it if we really have to. Also, if mapped * into the proxy device, we need to move the doorbell @@ -4956,9 +4974,8 @@ raw_spin_lock_init(&vpe->vpe_lock); vpe->vpe_id = vpe_id; vpe->vpt_page = vpt_page; - if (gic_rdists->has_rvpeid) - atomic_set(&vpe->vmapp_count, 0); - else + atomic_set(&vpe->vmapp_count, 0); + if (!gic_rdists->has_rvpeid) vpe->vpe_proxy_event = -1; return 0; @@ -6191,6 +6208,7 @@ #ifdef CONFIG_HISI_VIRTCCA_GUEST if (is_virtcca_cvm_world()) { device_initialize(&cvm_alloc_device); + enable_swiotlb_for_cvm_dev(&cvm_alloc_device, true); raw_spin_lock_init(&cvm_its_lock); } #endif
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
Changed
@@ -1097,6 +1097,8 @@ *pos += scnprintf(buf + *pos, len - *pos, "TX timeout threshold: %d seconds\n", dev->watchdog_timeo / HZ); + *pos += scnprintf(buf + *pos, len - *pos, "mac tunnel number: %u\n", + dev_specs->tnl_num); *pos += scnprintf(buf + *pos, len - *pos, "Hilink Version: %u\n", dev_specs->hilink_version); }
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
Changed
@@ -393,6 +393,14 @@ static void hinic3_tx_rx_ops_init(struct hinic3_nic_dev *nic_dev) { + if (HINIC3_SUPPORT_TX_COMPACT_WQE_OL(nic_dev->hwdev)) { + nic_dev->tx_rx_ops.tx_set_wqebb_cnt = hinic3_tx_set_compact_offload_wqebb_cnt; + nic_dev->tx_rx_ops.tx_set_wqe_task = hinic3_tx_set_compact_offload_wqe_task; + } else { + nic_dev->tx_rx_ops.tx_set_wqebb_cnt = hinic3_tx_set_wqebb_cnt; + nic_dev->tx_rx_ops.tx_set_wqe_task = hinic3_tx_set_wqe_task; + } + if (HINIC3_SUPPORT_RX_COMPACT_CQE(nic_dev->hwdev)) nic_dev->tx_rx_ops.rx_get_cqe_info = hinic3_rx_get_compact_cqe_info; else
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
Changed
@@ -43,6 +43,7 @@ NIC_F_VF_MAC = BIT(15), NIC_F_RATE_LIMIT = BIT(16), NIC_F_RXQ_RECOVERY = BIT(17), + NIC_F_TX_COMPACT_WQE_OL = BIT(19), NIC_F_RX_COMPACT_CQE = BIT(20), NIC_F_HTN_CMDQ = BIT(21), };
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_mt.h
Changed
@@ -4,7 +4,7 @@ #ifndef HINIC3_MT_H #define HINIC3_MT_H -#define HINIC3_DRV_NAME "hisdk3" +#define HINIC3_DRV_NAME "hinic3" #define HINIC3_CHIP_NAME "hinic" /* Interrupt at most records, interrupt will be recorded in the FFM */
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h
Changed
@@ -98,6 +98,7 @@ #define HINIC3_SUPPORT_ALLMULTI(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, ALLMULTI) #define HINIC3_SUPPORT_VF_MAC(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, VF_MAC) #define HINIC3_SUPPORT_RATE_LIMIT(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RATE_LIMIT) +#define HINIC3_SUPPORT_TX_COMPACT_WQE_OL(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, TX_COMPACT_WQE_OL) #define HINIC3_SUPPORT_RX_COMPACT_CQE(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RX_COMPACT_CQE) #define HINIC3_SUPPORT_RXQ_RECOVERY(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RXQ_RECOVERY)
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h
Changed
@@ -247,6 +247,7 @@ struct hinic3_txq *txqs; struct hinic3_rxq *rxqs; struct hinic3_dyna_txrxq_params q_params; + u8 cqe_mode; /* rx_cqe */ u16 num_qp_irq; struct irq_info *qps_irq_info;
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c
Changed
@@ -35,7 +35,7 @@ static unsigned char rq_wqe_type = HINIC3_NORMAL_RQ_WQE; module_param(rq_wqe_type, byte, 0444); -MODULE_PARM_DESC(rq_wqe_type, "RQ WQE type 0-8Bytes, 1-16Bytes, 2-32Bytes (default=2)"); +MODULE_PARM_DESC(rq_wqe_type, "RQ WQE type 0-8Bytes, 1-16Bytes, 2-32Bytes (default=1)"); /*lint +e806*/ static u32 tx_drop_thd_on = HINIC3_DEAULT_DROP_THD_ON; @@ -274,8 +274,15 @@ /* rq_wqe_type is the configuration when the driver is installed, * but it may not be the actual configuration. */ - if (rq_wqe_type != HINIC3_NORMAL_RQ_WQE && rq_wqe_type != HINIC3_EXTEND_RQ_WQE) - return HINIC3_NORMAL_RQ_WQE; + if (HINIC3_SUPPORT_RX_COMPACT_CQE(hwdev)) { + if (rq_wqe_type != HINIC3_COMPACT_RQ_WQE && rq_wqe_type != HINIC3_NORMAL_RQ_WQE && + rq_wqe_type != HINIC3_EXTEND_RQ_WQE) { + return HINIC3_NORMAL_RQ_WQE; + } + } else { + if (rq_wqe_type != HINIC3_NORMAL_RQ_WQE && rq_wqe_type != HINIC3_EXTEND_RQ_WQE) + return HINIC3_NORMAL_RQ_WQE; + } return rq_wqe_type; } @@ -289,7 +296,7 @@ rq->msix_entry_idx = rq_msix_idx; err = hinic3_wq_create(nic_io->hwdev, &rq->wq, rq_depth, - (u16)BIT(HINIC3_RQ_WQEBB_SHIFT + rq_wqe_type)); + (u16)BIT(HINIC3_RQ_WQEBB_SHIFT + rq->wqe_type)); if (err) { sdk_err(nic_io->dev_hdl, "Failed to create rx queue(%u) wq\n", q_id); @@ -774,6 +781,10 @@ RQ_CTXT_WQ_PAGE_SET(2, WQE_TYPE); rq_ctxt->cqe_sge_len = RQ_CTXT_CQE_LEN_SET(1, CQE_LEN); break; + case HINIC3_COMPACT_RQ_WQE: + /* use 8Byte WQE */ + rq_ctxt->wq_pfn_hi_type_owner |= RQ_CTXT_WQ_PAGE_SET(3, WQE_TYPE); + break; default: pr_err("Invalid rq wqe type: %u", wqe_type); } @@ -985,6 +996,10 @@ rq_attr.intr_idx = nic_io->rqq_id.msix_entry_idx; rq_attr.l2nic_rqn = q_id; rq_attr.cqe_type = 0; + if (hinic3_get_rq_wqe_type(nic_io->hwdev) == HINIC3_COMPACT_RQ_WQE) { + rq_attr.cqe_type = 1; + rq_attr.ci_dma_base = HINIC3_CI_PADDR(nic_io->rq_ci_dma_base, q_id); + } err = hinic3_set_rq_ci_ctx(nic_io, &rq_attr); if (err != 0) {
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h
Changed
@@ -47,7 +47,9 @@ }; struct hinic3_tx_rx_ops { - void (*rx_get_cqe_info)(void *rx_cqe, void *cqe_info); + void (*tx_set_wqebb_cnt)(void *wqe_combo, u32 offload, u16 num_sge); + void (*tx_set_wqe_task)(void *wqe_combo, void *offload_info); + void (*rx_get_cqe_info)(void *rx_cqe, void *cqe_info, u8 cqe_mode); bool (*rx_cqe_done)(void *rxq, void **rx_cqe); }; @@ -334,4 +336,5 @@ void hinic3_deinit_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params); int hinic3_init_nicio_res(void *hwdev); void hinic3_deinit_nicio_res(void *hwdev); +int hinic3_get_rq_wqe_type(void *hwdev); #endif
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h
Changed
@@ -194,6 +194,9 @@ }; struct hinic3_cqe_info { + u8 pkt_offset; + u8 rsvd3; + u8 lro_num; u8 vlan_offload; u8 pkt_fmt; @@ -230,8 +233,14 @@ u32 cqe_lo_addr; }; +struct hinic3_rq_compact_wqe { + u32 buf_hi_addr; + u32 buf_lo_addr; +}; + struct hinic3_rq_wqe { union { + struct hinic3_rq_compact_wqe compact_wqe; struct hinic3_rq_normal_wqe normal_wqe; struct hinic3_rq_extend_wqe extend_wqe; }; @@ -284,9 +293,13 @@ struct hinic3_sq_task *task; struct hinic3_sq_bufdesc *bds_head; struct hinic3_sq_bufdesc *bds_sec2; + u16 first_bds_num; - u32 wqe_type; - u32 task_type; + u8 wqe_type; + u8 task_type; + + u16 wqebb_cnt; + u8 rsvd2; }; /* ************* SQ_CTRL ************** */ @@ -300,10 +313,37 @@ }; enum sq_wqe_tasksect_len_type { - SQ_WQE_TASKSECT_46BITS = 0, + SQ_WQE_TASKSECT_4BYTES = 0, SQ_WQE_TASKSECT_16BYTES = 1, }; +struct hinic3_offload_info { + u8 encapsulation; + u8 esp_next_proto; + u8 inner_l4_en; + u8 inner_l3_en; + u8 out_l4_en; + u8 out_l3_en; + u8 ipsec_offload; + u8 pkt_1588; + u8 vlan_sel; + u8 vlan_valid; + u16 vlan1_tag; + u32 ip_identify; +}; + +struct hinic3_queue_info { + u8 pri; + u8 uc; + u8 sctp; + u8 udp_dp_en; + u8 tso; + u8 ufo; + u8 payload_offset; + u8 pkt_type; + u16 mss; +}; + #define SQ_CTRL_BD0_LEN_SHIFT 0 #define SQ_CTRL_RSVD_SHIFT 18 #define SQ_CTRL_BUFDESC_NUM_SHIFT 19 @@ -335,7 +375,7 @@ #define SQ_CTRL_QUEUE_INFO_PLDOFF_SHIFT 2 #define SQ_CTRL_QUEUE_INFO_UFO_SHIFT 10 #define SQ_CTRL_QUEUE_INFO_TSO_SHIFT 11 -#define SQ_CTRL_QUEUE_INFO_TCPUDP_CS_SHIFT 12 +#define SQ_CTRL_QUEUE_INFO_UDP_DP_EN_SHIFT 12 #define SQ_CTRL_QUEUE_INFO_MSS_SHIFT 13 #define SQ_CTRL_QUEUE_INFO_SCTP_SHIFT 27 #define SQ_CTRL_QUEUE_INFO_UC_SHIFT 28 @@ -345,7 +385,7 @@ #define SQ_CTRL_QUEUE_INFO_PLDOFF_MASK 0xFFU #define SQ_CTRL_QUEUE_INFO_UFO_MASK 0x1U #define SQ_CTRL_QUEUE_INFO_TSO_MASK 0x1U -#define SQ_CTRL_QUEUE_INFO_TCPUDP_CS_MASK 0x1U +#define SQ_CTRL_QUEUE_INFO_UDP_DP_EN_MASK 0x1U #define SQ_CTRL_QUEUE_INFO_MSS_MASK 0x3FFFU #define SQ_CTRL_QUEUE_INFO_SCTP_MASK 0x1U #define SQ_CTRL_QUEUE_INFO_UC_MASK 0x1U @@ -363,6 +403,61 @@ ((val) & (~(SQ_CTRL_QUEUE_INFO_##member##_MASK << \ SQ_CTRL_QUEUE_INFO_##member##_SHIFT))) +#define SQ_CTRL_15BIT_QUEUE_INFO_PKT_TYPE_SHIFT 14 +#define SQ_CTRL_15BIT_QUEUE_INFO_PLDOFF_SHIFT 16 +#define SQ_CTRL_15BIT_QUEUE_INFO_UFO_SHIFT 24 +#define SQ_CTRL_15BIT_QUEUE_INFO_TSO_SHIFT 25 +#define SQ_CTRL_15BIT_QUEUE_INFO_UDP_DP_EN_SHIFT 26 +#define SQ_CTRL_15BIT_QUEUE_INFO_SCTP_SHIFT 27 + +#define SQ_CTRL_15BIT_QUEUE_INFO_PKT_TYPE_MASK 0x3U +#define SQ_CTRL_15BIT_QUEUE_INFO_PLDOFF_MASK 0xFFU +#define SQ_CTRL_15BIT_QUEUE_INFO_UFO_MASK 0x1U +#define SQ_CTRL_15BIT_QUEUE_INFO_TSO_MASK 0x1U +#define SQ_CTRL_15BIT_QUEUE_INFO_UDP_DP_EN_MASK 0x1U +#define SQ_CTRL_15BIT_QUEUE_INFO_SCTP_MASK 0x1U + +#define SQ_CTRL_15BIT_QUEUE_INFO_SET(val, member) \ + (((u32)(val) & SQ_CTRL_15BIT_QUEUE_INFO_##member##_MASK) << \ + SQ_CTRL_15BIT_QUEUE_INFO_##member##_SHIFT) + +#define SQ_CTRL_15BIT_QUEUE_INFO_GET(val, member) \ + (((val) >> SQ_CTRL_15BIT_QUEUE_INFO_##member##_SHIFT) & \ + SQ_CTRL_15BIT_QUEUE_INFO_##member##_MASK) + +#define SQ_CTRL_15BIT_QUEUE_INFO_CLEAR(val, member) \ + ((val) & (~(SQ_CTRL_15BIT_QUEUE_INFO_##member##_MASK << \ + SQ_CTRL_15BIT_QUEUE_INFO_##member##_SHIFT))) + +#define SQ_TASK_INFO_PKT_1588_SHIFT 31 +#define SQ_TASK_INFO_IPSEC_PROTO_SHIFT 30 +#define SQ_TASK_INFO_OUT_L3_EN_SHIFT 28 +#define SQ_TASK_INFO_OUT_L4_EN_SHIFT 27 +#define SQ_TASK_INFO_INNER_L3_EN_SHIFT 25 +#define SQ_TASK_INFO_INNER_L4_EN_SHIFT 24 +#define SQ_TASK_INFO_ESP_NEXT_PROTO_SHIFT 22 
+#define SQ_TASK_INFO_VLAN_VALID_SHIFT 19 +#define SQ_TASK_INFO_VLAN_SEL_SHIFT 16 +#define SQ_TASK_INFO_VLAN_TAG_SHIFT 0 + +#define SQ_TASK_INFO_PKT_1588_MASK 0x1U +#define SQ_TASK_INFO_IPSEC_PROTO_MASK 0x1U +#define SQ_TASK_INFO_OUT_L3_EN_MASK 0x1U +#define SQ_TASK_INFO_OUT_L4_EN_MASK 0x1U +#define SQ_TASK_INFO_INNER_L3_EN_MASK 0x1U +#define SQ_TASK_INFO_INNER_L4_EN_MASK 0x1U +#define SQ_TASK_INFO_ESP_NEXT_PROTO_MASK 0x3U +#define SQ_TASK_INFO_VLAN_VALID_MASK 0x1U +#define SQ_TASK_INFO_VLAN_SEL_MASK 0x7U +#define SQ_TASK_INFO_VLAN_TAG_MASK 0xFFFFU + +#define SQ_TASK_INFO_SET(val, member) \ + (((u32)(val) & SQ_TASK_INFO_##member##_MASK) << \ + SQ_TASK_INFO_##member##_SHIFT) +#define SQ_TASK_INFO_GET(val, member) \ + (((val) >> SQ_TASK_INFO_##member##_SHIFT) & \ + SQ_TASK_INFO_##member##_MASK) + #define SQ_TASK_INFO0_TUNNEL_FLAG_SHIFT 19 #define SQ_TASK_INFO0_ESP_NEXT_PROTO_SHIFT 22 #define SQ_TASK_INFO0_INNER_L4_EN_SHIFT 24 @@ -419,30 +514,4 @@ #define LLT_STATIC_DEF_SAVED #endif -static inline u32 hinic3_get_pkt_len_for_super_cqe(const struct hinic3_rq_cqe *cqe, - bool last) -{ - u32 pkt_len = hinic3_hw_cpu32(cqe->pkt_info); - - if (!last) - return RQ_CQE_PKT_LEN_GET(pkt_len, FIRST_LEN); - else - return RQ_CQE_PKT_LEN_GET(pkt_len, LAST_LEN); -} - -/* * - * hinic3_set_vlan_tx_offload - set vlan offload info - * @task: wqe task section - * @vlan_tag: vlan tag - * @vlan_type: 0--select TPID0 in IPSU, 1--select TPID0 in IPSU - * 2--select TPID2 in IPSU, 3--select TPID3 in IPSU, 4--select TPID4 in IPSU - */ -static inline void hinic3_set_vlan_tx_offload(struct hinic3_sq_task *task, - u16 vlan_tag, u8 vlan_type) -{ - task->vlan_offload = SQ_TASK_INFO3_SET(vlan_tag, VLAN_TAG) | - SQ_TASK_INFO3_SET(vlan_type, VLAN_TYPE) | - SQ_TASK_INFO3_SET(1U, VLAN_TAG_VALID); -} - #endif
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
Changed
@@ -113,7 +113,7 @@ /* use fixed len */ rq_wqe->extend_wqe.buf_desc.sge.len = nic_dev->rx_buff_len; - } else { + } else if (rxq->rq->wqe_type == HINIC3_NORMAL_RQ_WQE) { rq_wqe->normal_wqe.cqe_hi_addr = upper_32_bits(rx_info->cqe_dma); rq_wqe->normal_wqe.cqe_lo_addr = @@ -153,11 +153,16 @@ hinic3_hw_be32(upper_32_bits(dma_addr)); rq_wqe->extend_wqe.buf_desc.sge.lo_addr = hinic3_hw_be32(lower_32_bits(dma_addr)); - } else { + } else if (rxq->rq->wqe_type == HINIC3_NORMAL_RQ_WQE) { rq_wqe->normal_wqe.buf_hi_addr = hinic3_hw_be32(upper_32_bits(dma_addr)); rq_wqe->normal_wqe.buf_lo_addr = hinic3_hw_be32(lower_32_bits(dma_addr)); + } else { + rq_wqe->compact_wqe.buf_hi_addr = + hinic3_hw_be32(upper_32_bits(dma_addr)); + rq_wqe->compact_wqe.buf_lo_addr = + hinic3_hw_be32(lower_32_bits(dma_addr)); } rxq->next_to_update = (u16)((rxq->next_to_update + 1) & rxq->q_mask); } @@ -241,7 +246,7 @@ static bool hinic3_add_rx_frag(struct hinic3_rxq *rxq, struct hinic3_rx_info *rx_info, - struct sk_buff *skb, u32 size) + struct sk_buff *skb, u32 size, u8 offset) { struct page *page; u8 *va; @@ -260,7 +265,7 @@ DMA_FROM_DEVICE); if (size <= HINIC3_RX_HDR_SIZE && !skb_is_nonlinear(skb)) { - memcpy(__skb_put(skb, size), va, + memcpy(__skb_put(skb, size), va + offset, ALIGN(size, sizeof(long))); /*lint !e666*/ /* page is not reserved, we can reuse buffer as-is */ @@ -273,7 +278,7 @@ } skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, - (int)rx_info->page_offset, (int)size, rxq->buf_len); + (int)(rx_info->page_offset + offset), (int)size, rxq->buf_len); /* avoid re-using remote pages */ if (unlikely(page_to_nid(page) != numa_node_id())) @@ -291,26 +296,30 @@ } static void packaging_skb(struct hinic3_rxq *rxq, struct sk_buff *head_skb, - u8 sge_num, u32 pkt_len) + u8 sge_num, u32 pkt_len, u8 pkt_offset) { struct hinic3_rx_info *rx_info = NULL; struct sk_buff *skb = NULL; u8 frag_num = 0; - u32 size; + u32 frag_size; u32 sw_ci; - u32 temp_pkt_len = pkt_len; - u8 temp_sge_num = sge_num; + u8 tmp_sge_num; + u32 tmp_pkt_len; + u8 tmp_pkt_offset; sw_ci = rxq->cons_idx & rxq->q_mask; skb = head_skb; - while (temp_sge_num) { + tmp_sge_num = sge_num; + tmp_pkt_len = pkt_len; + tmp_pkt_offset = pkt_offset; + while (tmp_sge_num) { rx_info = &rxq->rx_infosw_ci; sw_ci = (sw_ci + 1) & rxq->q_mask; - if (unlikely(temp_pkt_len > rxq->buf_len)) { - size = rxq->buf_len; - temp_pkt_len -= rxq->buf_len; + if (unlikely(tmp_pkt_len + tmp_pkt_offset > rxq->buf_len)) { + frag_size = rxq->buf_len - tmp_pkt_offset; + tmp_pkt_len -= frag_size; } else { - size = temp_pkt_len; + frag_size = tmp_pkt_len; } if (unlikely(frag_num == MAX_SKB_FRAGS)) { @@ -322,12 +331,12 @@ } if (unlikely(skb != head_skb)) { - head_skb->len += size; - head_skb->data_len += size; + head_skb->len += frag_size; + head_skb->data_len += frag_size; head_skb->truesize += rxq->buf_len; } - if (likely(hinic3_add_rx_frag(rxq, rx_info, skb, size))) { + if (likely(hinic3_add_rx_frag(rxq, rx_info, skb, frag_size, tmp_pkt_offset))) { hinic3_reuse_rx_page(rxq, rx_info); } else { /* we are not reusing the buffer so unmap it */ @@ -337,7 +346,8 @@ /* clear contents of buffer_info */ rx_info->buf_dma_addr = 0; rx_info->page = NULL; - temp_sge_num--; + tmp_sge_num--; + tmp_pkt_offset = 0; /* only first sge use offset */ frag_num++; } } @@ -354,6 +364,7 @@ struct sk_buff *skb = NULL; struct net_device *netdev = rxq->netdev; u32 pkt_len = cqe_info->pkt_len; + u8 pkt_offset = cqe_info->pkt_offset; u8 sge_num, skb_num; u16 wqebb_cnt = 0; @@ -361,7 +372,7 @@ if (unlikely(!head_skb)) 
return NULL; - sge_num = HINIC3_GET_SGE_NUM(pkt_len, rxq); + sge_num = HINIC3_GET_SGE_NUM(pkt_len + pkt_offset, rxq); if (likely(sge_num <= MAX_SKB_FRAGS)) skb_num = 1; else @@ -387,7 +398,7 @@ prefetchw(head_skb->data); wqebb_cnt = sge_num; - packaging_skb(rxq, head_skb, sge_num, pkt_len); + packaging_skb(rxq, head_skb, sge_num, pkt_len, pkt_offset); rxq->cons_idx += wqebb_cnt; rxq->delta += wqebb_cnt; @@ -857,7 +868,7 @@ (HINIC3_GET_RX_IP_TYPE(hinic3_hw_cpu32((cqe)->offload_type)) == \ HINIC3_RX_IPV6_PKT ? LRO_PKT_HDR_LEN_IPV6 : LRO_PKT_HDR_LEN_IPV4) -void hinic3_rx_get_cqe_info(void *rx_cqe, void *cqe_info) +void hinic3_rx_get_cqe_info(void *rx_cqe, void *cqe_info, u8 cqe_mode) { struct hinic3_rq_cqe *cqe = (struct hinic3_rq_cqe *)rx_cqe; struct hinic3_cqe_info *info = (struct hinic3_cqe_info *)cqe_info; @@ -880,15 +891,24 @@ info->rss_hash_value = dw3; } -void hinic3_rx_get_compact_cqe_info(void *rx_cqe, void *cqe_info) +void hinic3_rx_get_compact_cqe_info(void *rx_cqe, void *cqe_info, u8 cqe_mode) { struct hinic3_rq_cqe *cqe = (struct hinic3_rq_cqe *)rx_cqe; struct hinic3_cqe_info *info = (struct hinic3_cqe_info *)cqe_info; u32 dw0, dw1, dw2; - dw0 = hinic3_hw_cpu32(cqe->status); - dw1 = hinic3_hw_cpu32(cqe->vlan_len); - dw2 = hinic3_hw_cpu32(cqe->offload_type); + if (cqe_mode != HINIC3_RQ_CQE_INTEGRATE) { + dw0 = hinic3_hw_cpu32(cqe->status); + dw1 = hinic3_hw_cpu32(cqe->vlan_len); + dw2 = hinic3_hw_cpu32(cqe->offload_type); + } else { + /* When rx wqe is compact, cqe is integrated with packet by big endian, + * explicit endian conversion is needed. + */ + dw0 = be32_to_cpu(cqe->status); + dw1 = be32_to_cpu(cqe->vlan_len); + dw2 = be32_to_cpu(cqe->offload_type); + } info->cqe_type = RQ_COMPACT_CQE_STATUS_GET(dw0, CQE_TYPE); info->csum_err = RQ_COMPACT_CQE_STATUS_GET(dw0, CSUM_ERR); @@ -916,9 +936,28 @@ info->lro_num = RQ_COMPACT_CQE_OFFLOAD_GET(dw2, NUM_LRO); info->vlan_tag = RQ_COMPACT_CQE_OFFLOAD_GET(dw2, VLAN); } + if (info->cqe_type == HINIC3_RQ_CQE_INTEGRATE) { + info->pkt_offset = info->cqe_len == RQ_COMPACT_CQE_16BYTE ? + HINIC3_COMPACT_CQE_16B : HINIC3_COMPACT_CQE_8B; + } +} + +static bool rx_integrated_cqe_done(void *rx_queue, void **rx_cqe) +{ + u16 sw_ci; + u16 hw_ci; + struct hinic3_rxq *rxq = rx_queue; +
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
Changed
@@ -28,6 +28,12 @@ #define HINIC3_HEADER_DATA_UNIT 2 +#define HINIC3_COMPACT_CQE_8B 8 +#define HINIC3_COMPACT_CQE_16B 16 + +#define HINIC3_RQ_CQE_SEPARATE 0 +#define HINIC3_RQ_CQE_INTEGRATE 1 + struct hinic3_rxq_stats { u64 packets; u64 bytes; @@ -150,9 +156,9 @@ void hinic3_rxq_clean_stats(struct hinic3_rxq_stats *rxq_stats); -void hinic3_rx_get_cqe_info(void *rx_cqe, void *cqe_info); +void hinic3_rx_get_cqe_info(void *rx_cqe, void *cqe_info, u8 cqe_mode); -void hinic3_rx_get_compact_cqe_info(void *rx_cqe, void *cqe_info); +void hinic3_rx_get_compact_cqe_info(void *rx_cqe, void *cqe_info, u8 cqe_mode); void hinic3_rxq_check_work_handler(struct work_struct *work);
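With an integrated CQE (HINIC3_RQ_CQE_INTEGRATE), the 8- or 16-byte CQE sits in front of the packet in the first receive buffer, which is why packaging_skb() in hinic3_rx.c above shortens only the first fragment by pkt_offset. The following standalone sketch (not driver code) mirrors that split, looping on the remaining length instead of the SGE count for brevity; the numbers are illustrative.

#include <stdio.h>

int main(void)
{
	unsigned int buf_len = 2048;     /* rxq->buf_len */
	unsigned int pkt_len = 5000;     /* from the CQE */
	unsigned int pkt_offset = 16;    /* HINIC3_COMPACT_CQE_16B */
	unsigned int frag = 0;

	while (pkt_len) {
		unsigned int frag_size;

		if (pkt_len + pkt_offset > buf_len) {
			/* first buffer loses pkt_offset bytes to the CQE */
			frag_size = buf_len - pkt_offset;
			pkt_len -= frag_size;
		} else {
			frag_size = pkt_len;
			pkt_len = 0;
		}
		printf("frag %u: offset %u, size %u\n",
		       frag++, pkt_offset, frag_size);
		pkt_offset = 0;  /* only the first SGE uses the offset */
	}
	return 0;
}

With these values the packet spans three fragments (2032 + 2048 + 920 bytes), which matches HINIC3_GET_SGE_NUM(pkt_len + pkt_offset, rxq) being computed over the offset-adjusted length.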
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_tx.c
Changed
@@ -218,29 +218,51 @@ } } -static int hinic3_tx_csum(struct hinic3_txq *txq, struct hinic3_sq_task *task, - struct sk_buff *skb) +static void get_inner_l3_l4_type(struct sk_buff *skb, union hinic3_ip *ip, + union hinic3_l4 *l4, + enum sq_l3_type *l3_type, u8 *l4_proto) +{ + unsigned char *exthdr = NULL; + __be16 frag_off = 0; + + if (ip->v4->version == IP4_VERSION) { + *l3_type = IPV4_PKT_WITH_CHKSUM_OFFLOAD; + *l4_proto = ip->v4->protocol; + } else if (ip->v4->version == IP6_VERSION) { + *l3_type = IPV6_PKT; + exthdr = ip->hdr + sizeof(*ip->v6); + *l4_proto = ip->v6->nexthdr; + if (exthdr != l4->hdr) + ipv6_skip_exthdr(skb, (int)(exthdr - skb->data), + l4_proto, &frag_off); + } else { + *l3_type = UNKNOWN_L3TYPE; + *l4_proto = 0; + } +} + +static int hinic3_tx_csum(struct hinic3_txq *txq, struct sk_buff *skb, + struct hinic3_offload_info *offload_info, + struct hinic3_queue_info *queue_info) { if (skb->ip_summed != CHECKSUM_PARTIAL) return 0; if (skb->encapsulation) { union hinic3_ip ip; + union hinic3_l4 l4; u8 l4_proto; - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, TUNNEL_FLAG); + offload_info->encapsulation = 1; ip.hdr = skb_network_header(skb); if (ip.v4->version == IPV4_VERSION) { l4_proto = ip.v4->protocol; + l4.hdr = skb_transport_header(skb); } else if (ip.v4->version == IPV6_VERSION) { - union hinic3_l4 l4; - unsigned char *exthdr; + unsigned char *exthdr = NULL; __be16 frag_off; -#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, OUT_L4_EN); -#endif exthdr = ip.hdr + sizeof(*ip.v6); l4_proto = ip.v6->nexthdr; l4.hdr = skb_transport_header(skb); @@ -251,180 +273,139 @@ l4_proto = IPPROTO_RAW; } + if (l4_proto == IPPROTO_UDP) + queue_info->udp_dp_en = 1; + if (l4_proto != IPPROTO_UDP || ((struct udphdr *)skb_transport_header(skb))->dest != VXLAN_OFFLOAD_PORT_LE) { TXQ_STATS_INC(txq, unknown_tunnel_pkt); - /* Unsupport tunnel packet, disable csum offload */ skb_checksum_help(skb); return 0; } } - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN); + offload_info->inner_l4_en = 1; return 1; } -static void get_inner_l3_l4_type(struct sk_buff *skb, union hinic3_ip *ip, - union hinic3_l4 *l4, - enum sq_l3_type *l3_type, u8 *l4_proto) -{ - unsigned char *exthdr = NULL; - - if (ip->v4->version == IP4_VERSION) { - *l3_type = IPV4_PKT_WITH_CHKSUM_OFFLOAD; - *l4_proto = ip->v4->protocol; - -#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD - /* inner_transport_header is wrong in centos7.0 and suse12.1 */ - l4->hdr = ip->hdr + ((u8)ip->v4->ihl << IP_HDR_IHL_UNIT_SHIFT); -#endif - } else if (ip->v4->version == IP6_VERSION) { - *l3_type = IPV6_PKT; - exthdr = ip->hdr + sizeof(*ip->v6); - *l4_proto = ip->v6->nexthdr; - if (exthdr != l4->hdr) { - __be16 frag_off = 0; -#ifndef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD - ipv6_skip_exthdr(skb, (int)(exthdr - skb->data), - l4_proto, &frag_off); -#else - int pld_off = 0; - - pld_off = ipv6_skip_exthdr(skb, - (int)(exthdr - skb->data), - l4_proto, &frag_off); - l4->hdr = skb->data + pld_off; -#endif - } - } else { - *l3_type = UNKNOWN_L3TYPE; - *l4_proto = 0; - } -} - -static void hinic3_set_tso_info(struct hinic3_sq_task *task, u32 *queue_info, +static void hinic3_set_tso_info(struct hinic3_offload_info *offload_info, + struct hinic3_queue_info *queue_info, enum sq_l4offload_type l4_offload, u32 offset, u32 mss) { if (l4_offload == TCP_OFFLOAD_ENABLE) { - *queue_info |= SQ_CTRL_QUEUE_INFO_SET(1U, TSO); - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN); + queue_info->tso = 1; + offload_info->inner_l4_en = 1; } else if (l4_offload == 
UDP_OFFLOAD_ENABLE) { - *queue_info |= SQ_CTRL_QUEUE_INFO_SET(1U, UFO); - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN); + queue_info->ufo = 1; + offload_info->inner_l4_en = 1; } /* Default enable L3 calculation */ - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L3_EN); + offload_info->inner_l3_en = 1; - *queue_info |= SQ_CTRL_QUEUE_INFO_SET(offset >> 1, PLDOFF); + queue_info->payload_offset = (u8)(offset >> 1); /* set MSS value */ - *queue_info = SQ_CTRL_QUEUE_INFO_CLEAR(*queue_info, MSS); - *queue_info |= SQ_CTRL_QUEUE_INFO_SET(mss, MSS); + queue_info->mss = (u16)mss; } -static int hinic3_tso(struct hinic3_sq_task *task, u32 *queue_info, - struct sk_buff *skb) +static inline void hinic3_inner_tso_offload(struct hinic3_offload_info *offload_info, + struct hinic3_queue_info *queue_info, + struct sk_buff *skb, union hinic3_ip ip, + union hinic3_l4 l4) { - enum sq_l4offload_type l4_offload = OFFLOAD_DISABLE; + u8 l4_proto; + u32 offset = 0; enum sq_l3_type l3_type; + enum sq_l4offload_type l4_offload = OFFLOAD_DISABLE; + + get_inner_l3_l4_type(skb, &ip, &l4, &l3_type, &l4_proto); + + if (l4_proto == IPPROTO_TCP) + l4.tcp->check = ~csum_magic(&ip, IPPROTO_TCP); + + get_inner_l4_info(skb, &l4, l4_proto, &offset, &l4_offload); + + hinic3_set_tso_info(offload_info, queue_info, l4_offload, offset, + skb_shinfo(skb)->gso_size); +} + +static int hinic3_tso(struct hinic3_offload_info *offload_info, + struct hinic3_queue_info *queue_info, struct sk_buff *skb) +{ union hinic3_ip ip; union hinic3_l4 l4; - u32 offset = 0; u8 l4_proto; - int err; if (!skb_is_gso(skb)) return 0; - err = skb_cow_head(skb, 0); - if (err < 0) - return err; + if (skb_cow_head(skb, 0) < 0) + return -EINVAL; + l4.hdr = skb_transport_header(skb); + ip.hdr = skb_network_header(skb); if (skb->encapsulation) { u32 gso_type = skb_shinfo(skb)->gso_type; /* L3 checksum always enable */ - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, OUT_L3_EN); - task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, TUNNEL_FLAG);
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/net/ethernet/huawei/hinic3/hinic3_tx.h
Changed
@@ -13,7 +13,7 @@ #include "hinic3_nic_qp.h" #include "hinic3_nic_io.h" -#define VXLAN_OFFLOAD_PORT_LE 46354 /* big end is 4789 */ +#define VXLAN_OFFLOAD_PORT_LE 0xb512 /* big end is 4789 */ #define COMPACET_WQ_SKB_MAX_LEN 16383 @@ -142,6 +142,14 @@ void hinic3_set_txq_cos(struct hinic3_nic_dev *nic_dev, u16 start_qid, u16 q_num, u8 cos); +void hinic3_tx_set_wqebb_cnt(void *wqe_combo, u32 offload, u16 num_sge); + +void hinic3_tx_set_compact_offload_wqebb_cnt(void *wqe_combo, u32 offload, u16 num_sge); + +void hinic3_tx_set_wqe_task(void *wqe_combo, void *offload_info); + +void hinic3_tx_set_compact_offload_wqe_task(void *wqe_combo, void *offload_info); + #ifdef static #undef static #define LLT_STATIC_DEF_SAVED
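The VXLAN_OFFLOAD_PORT_LE change above is purely cosmetic: 46354 and 0xb512 are the same number, namely the IANA VXLAN UDP port 4789 (0x12b5) as a little-endian host reads it after network-byte-order conversion. A standalone check (not part of the patch):

#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>

int main(void)
{
	/* On a little-endian host htons(4789) byte-swaps 0x12b5 to 0xb512,
	 * which is the value the driver compares the UDP destination port
	 * against without converting the header field back to host order. */
	assert(htons(4789) == 0xb512);
	printf("VXLAN_OFFLOAD_PORT_LE = 0x%04x\n", htons(4789));
	return 0;
}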
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/Kconfig
Changed
@@ -180,6 +180,18 @@ Extension, which provides periodic sampling of operations in the CPU pipeline and reports this via the perf AUX interface. +config ARM64_BRBE + bool "Enable support for branch stack sampling using FEAT_BRBE" + depends on PERF_EVENTS && ARM64 && ARM_PMU + depends on !FUNCTION_ALIGNMENT_64B + default y + help + Enable perf support for Branch Record Buffer Extension (BRBE) which + records all branches taken in an execution path. This supports some + branch types and privilege based filtering. It captures additional + relevant information such as cycle count, misprediction and branch + type, branch privilege level etc. + config ARM_DMC620_PMU tristate "Enable PMU support for the ARM DMC-620 memory controller" depends on (ARM64 && ACPI) || COMPILE_TEST
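For context, branch records reach userspace through the standard perf branch-stack sampling ABI. The sketch below is a generic userspace example (assumed, not taken from the patch) of requesting branch samples; with CONFIG_ARM64_BRBE the ARM PMU can satisfy the request, otherwise the event_init path shown further down in arm_pmu.c returns -EOPNOTSUPP.

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_branch_sampling_event(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;
	/* ask for a branch stack with every sample */
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
	attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY |
				  PERF_SAMPLE_BRANCH_USER;
	attr.exclude_kernel = 1;

	/* monitor the calling thread on any CPU */
	return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}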
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/Makefile
Changed
@@ -18,6 +18,7 @@ obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o +obj-$(CONFIG_ARM64_BRBE) += arm_brbe.o obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/arm_brbe.c
Added
@@ -0,0 +1,1198 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Branch Record Buffer Extension Driver. + * + * Copyright (C) 2022-2023 ARM Limited + * + * Author: Anshuman Khandual <anshuman.khandual@arm.com> + */ +#include "arm_pmuv3_branch.h" + +#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT | \ + BRBFCR_EL1_INDIRECT | \ + BRBFCR_EL1_RTN | \ + BRBFCR_EL1_INDCALL | \ + BRBFCR_EL1_DIRCALL | \ + BRBFCR_EL1_CONDDIR) + +#define BRBFCR_EL1_CONFIG_MASK (BRBFCR_EL1_BANK_MASK | \ + BRBFCR_EL1_PAUSED | \ + BRBFCR_EL1_EnI | \ + BRBFCR_EL1_BRANCH_FILTERS) + +/* + * BRBTS_EL1 is currently not used for branch stack implementation + * purpose but BRBCR_ELx.TS needs to have a valid value from all + * available options. BRBCR_ELx_TS_VIRTUAL is selected for this. + */ +#define BRBCR_ELx_DEFAULT_TS FIELD_PREP(BRBCR_ELx_TS_MASK, BRBCR_ELx_TS_VIRTUAL) + +#define BRBCR_ELx_CONFIG_MASK (BRBCR_ELx_EXCEPTION | \ + BRBCR_ELx_ERTN | \ + BRBCR_ELx_CC | \ + BRBCR_ELx_MPRED | \ + BRBCR_ELx_ExBRE | \ + BRBCR_ELx_E0BRE | \ + BRBCR_ELx_FZP | \ + BRBCR_ELx_TS_MASK) + +/* + * BRBE Buffer Organization + * + * BRBE buffer is arranged as multiple banks of 32 branch record + * entries each. An individual branch record in a given bank could + * be accessed, after selecting the bank in BRBFCR_EL1.BANK and + * accessing the registers i.e BRBSRC, BRBTGT, BRBINF set with + * indices 0..31. + * + * Bank 0 + * + * --------------------------------- ------ + * | 00 | BRBSRC | BRBTGT | BRBINF | | 00 | + * --------------------------------- ------ + * | 01 | BRBSRC | BRBTGT | BRBINF | | 01 | + * --------------------------------- ------ + * | .. | BRBSRC | BRBTGT | BRBINF | | .. | + * --------------------------------- ------ + * | 31 | BRBSRC | BRBTGT | BRBINF | | 31 | + * --------------------------------- ------ + * + * Bank 1 + * + * --------------------------------- ------ + * | 32 | BRBSRC | BRBTGT | BRBINF | | 00 | + * --------------------------------- ------ + * | 33 | BRBSRC | BRBTGT | BRBINF | | 01 | + * --------------------------------- ------ + * | .. | BRBSRC | BRBTGT | BRBINF | | .. | + * --------------------------------- ------ + * | 63 | BRBSRC | BRBTGT | BRBINF | | 31 | + * --------------------------------- ------ + */ +#define BRBE_BANK_MAX_ENTRIES 32 +#define BRBE_MAX_BANK 2 +#define BRBE_MAX_ENTRIES (BRBE_BANK_MAX_ENTRIES * BRBE_MAX_BANK) + +#define BRBE_BANK0_IDX_MIN 0 +#define BRBE_BANK0_IDX_MAX 31 +#define BRBE_BANK1_IDX_MIN 32 +#define BRBE_BANK1_IDX_MAX 63 + +struct brbe_regset { + unsigned long brbsrc; + unsigned long brbtgt; + unsigned long brbinf; +}; + +#define PERF_BR_ARM64_MAX (PERF_BR_MAX + PERF_BR_NEW_MAX) + +struct arm64_perf_task_context { + struct brbe_regset storeBRBE_MAX_ENTRIES; + int nr_brbe_records; + + /* + * Branch Filter Mask + * + * This mask represents all branch record types i.e PERF_BR_XXX + * (as defined in core perf ABI) that can be generated with the + * event's branch_sample_type request. The mask layout could be + * found here. Although the bit 15 i.e PERF_BR_EXTEND_ABI never + * gets set in the mask. 
+ * + * 23 (PERF_BR_MAX + PERF_BR_NEW_MAX) 0 + * | | + * --------------------------------------------------------- + * | Extended ABI section | X | ABI section | + * --------------------------------------------------------- + */ + DECLARE_BITMAP(br_type_mask, PERF_BR_ARM64_MAX); +}; + +static void branch_mask_set_all(unsigned long *event_type_mask) +{ + int idx; + + for (idx = PERF_BR_UNKNOWN; idx < PERF_BR_EXTEND_ABI; idx++) + set_bit(idx, event_type_mask); + + for (idx = PERF_BR_NEW_FAULT_ALGN; idx < PERF_BR_NEW_MAX; idx++) + set_bit(PERF_BR_MAX + idx, event_type_mask); +} + +static void branch_mask_set_arch(unsigned long *event_type_mask) +{ + set_bit(PERF_BR_MAX + PERF_BR_NEW_FAULT_ALGN, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_NEW_FAULT_DATA, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_NEW_FAULT_INST, event_type_mask); + + set_bit(PERF_BR_MAX + PERF_BR_ARM64_FIQ, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_ARM64_DEBUG_HALT, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_ARM64_DEBUG_EXIT, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_ARM64_DEBUG_INST, event_type_mask); + set_bit(PERF_BR_MAX + PERF_BR_ARM64_DEBUG_DATA, event_type_mask); +} + +static void branch_entry_mask(struct perf_branch_entry *entry, + unsigned long *event_type_mask) +{ + u64 idx; + + bitmap_zero(event_type_mask, PERF_BR_ARM64_MAX); + for (idx = PERF_BR_UNKNOWN; idx < PERF_BR_EXTEND_ABI; idx++) { + if (entry->type == idx) + set_bit(idx, event_type_mask); + } + + if (entry->type == PERF_BR_EXTEND_ABI) { + for (idx = PERF_BR_NEW_FAULT_ALGN; idx < PERF_BR_NEW_MAX; idx++) { + if (entry->new_type == idx) + set_bit(PERF_BR_MAX + idx, event_type_mask); + } + } +} + +static void prepare_event_branch_type_mask(struct perf_event *event, + unsigned long *event_type_mask) +{ + u64 branch_sample = event->attr.branch_sample_type; + + bitmap_zero(event_type_mask, PERF_BR_ARM64_MAX); + + /* + * The platform specific branch types might not follow event's + * branch filter requests accurately. Let's add all of them as + * acceptible branch types during the filtering process. + */ + branch_mask_set_arch(event_type_mask); + + if (branch_sample & PERF_SAMPLE_BRANCH_ANY) { + branch_mask_set_all(event_type_mask); + return; + } + + if (branch_sample & PERF_SAMPLE_BRANCH_IND_JUMP) + set_bit(PERF_BR_IND, event_type_mask); + + set_bit(PERF_BR_UNCOND, event_type_mask); + if (branch_sample & PERF_SAMPLE_BRANCH_COND) { + clear_bit(PERF_BR_UNCOND, event_type_mask); + set_bit(PERF_BR_COND, event_type_mask); + } + + if (branch_sample & PERF_SAMPLE_BRANCH_CALL) + set_bit(PERF_BR_CALL, event_type_mask); + + if (branch_sample & PERF_SAMPLE_BRANCH_IND_CALL) + set_bit(PERF_BR_IND_CALL, event_type_mask); + + if (branch_sample & PERF_SAMPLE_BRANCH_ANY_CALL) { + set_bit(PERF_BR_CALL, event_type_mask); + set_bit(PERF_BR_IRQ, event_type_mask); + set_bit(PERF_BR_SYSCALL, event_type_mask); + set_bit(PERF_BR_SERROR, event_type_mask); + + if (branch_sample & PERF_SAMPLE_BRANCH_COND) + set_bit(PERF_BR_COND_CALL, event_type_mask); + } + + if (branch_sample & PERF_SAMPLE_BRANCH_ANY_RETURN) { + set_bit(PERF_BR_RET, event_type_mask);
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/arm_pmu.c
Changed
@@ -289,6 +289,23 @@ { struct arm_pmu *armpmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; + struct pmu_hw_events *cpuc = this_cpu_ptr(armpmu->hw_events); + int idx; + + /* + * Merge all branch filter requests from different perf + * events being added into this PMU. This includes both + * privilege and branch type filters. + */ + if (armpmu->has_branch_stack) { + cpuc->branch_sample_type = 0; + for (idx = 0; idx < ARMPMU_MAX_HWEVENTS; idx++) { + struct perf_event *event_idx = cpuc->eventsidx; + + if (event_idx && has_branch_stack(event_idx)) + cpuc->branch_sample_type |= event_idx->attr.branch_sample_type; + } + } /* * ARM pmu always has to reprogram the period, so ignore @@ -317,6 +334,9 @@ struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; + if (has_branch_stack(event)) + armpmu->branch_stack_del(event, hw_events); + armpmu_stop(event, PERF_EF_UPDATE); hw_events->eventsidx = NULL; armpmu->clear_event_idx(hw_events, event); @@ -342,6 +362,9 @@ if (idx < 0) return idx; + if (has_branch_stack(event)) + armpmu->branch_stack_add(event, hw_events); + /* * If there is an event in the counter we are going to use then make * sure it is disabled. @@ -511,13 +534,25 @@ !cpumask_test_cpu(event->cpu, &armpmu->supported_cpus)) return -ENOENT; - /* does not support taken branch sampling */ - if (has_branch_stack(event)) - return -EOPNOTSUPP; + if (has_branch_stack(event)) { + if (!armpmu->has_branch_stack) + return -EOPNOTSUPP; + + if (!armpmu->branch_stack_init(event)) + return -EOPNOTSUPP; + } return __hw_perf_event_init(event); } +static void armpmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in) +{ + struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu); + + if (armpmu->sched_task) + armpmu->sched_task(pmu_ctx, sched_in); +} + static void armpmu_enable(struct pmu *pmu) { struct arm_pmu *armpmu = to_arm_pmu(pmu); @@ -881,6 +916,7 @@ } pmu->pmu = (struct pmu) { + .sched_task = armpmu_sched_task, .pmu_enable = armpmu_enable, .pmu_disable = armpmu_disable, .event_init = armpmu_event_init,
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/arm_pmuv3.c
Changed
@@ -26,6 +26,7 @@ #include <linux/nmi.h> #include <asm/arm_pmuv3.h> +#include "arm_pmuv3_branch.h" /* ARMv8 Cortex-A53 specific event types. */ #define ARMV8_A53_PERFCTR_PREF_LINEFILL 0xC2 @@ -834,14 +835,70 @@ armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E); kvm_vcpu_pmu_resync_el0(); + if (cpu_pmu->has_branch_stack) + armv8pmu_branch_enable(cpu_pmu); } static void armv8pmu_stop(struct arm_pmu *cpu_pmu) { + if (cpu_pmu->has_branch_stack) + armv8pmu_branch_disable(); + /* Disable all counters */ armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E); } +static void read_branch_records(struct pmu_hw_events *cpuc, + struct perf_event *event, + struct perf_sample_data *data, + bool *branch_captured) +{ + struct branch_records event_records; + + /* + * CPU specific branch records buffer must have been allocated already + * for the hardware records to be captured and processed further. + */ + if (WARN_ON(!cpuc->branches)) + return; + + /* + * When the current task context does not match with the PMU overflown + * event, the captured branch records here cannot be co-related to the + * overflowed event. Report to the user - as if no branch records have + * been captured, and flush branch records. + */ + if (event->ctx->task && (cpuc->branch_context != event->ctx)) + return; + + /* + * Read the branch records from the hardware once after the PMU IRQ + * has been triggered but subsequently same records can be used for + * other events that might have been overflowed simultaneously thus + * saving much CPU cycles. + */ + if (!*branch_captured) { + armv8pmu_branch_read(cpuc, event); + *branch_captured = true; + } + + /* + * Filter captured branch records + * + * PMU captured branch records would contain samples applicable for + * the aggregated branch filters, for all events that got scheduled + * on this PMU simultaneously. Hence these branch records need to + * be filtered first so that each individual event get samples they + * had requested originally. + */ + if (cpuc->branch_sample_type != event->attr.branch_sample_type) { + arm64_filter_branch_records(cpuc, event, &event_records); + perf_sample_save_brstack(data, event, &event_records.branch_stack, NULL); + return; + } + perf_sample_save_brstack(data, event, &cpuc->branches->branch_stack, NULL); +} + static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu) { u32 pmovsr; @@ -849,6 +906,7 @@ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events); struct pt_regs *regs; int idx; + bool branch_captured = false; /* * Get and reset the IRQ flags @@ -893,6 +951,13 @@ continue; /* + * PMU IRQ should remain asserted until all branch records + * are captured and processed into struct perf_sample_data. + */ + if (has_branch_stack(event) && cpu_pmu->has_branch_stack) + read_branch_records(cpuc, event, &data, &branch_captured); + + /* * Perf event overflow will queue the processing of the event as * an irq_work which will be taken care of in the handling of * IPI_IRQ_WORK. @@ -901,6 +966,8 @@ cpu_pmu->disable(event); } armv8pmu_start(cpu_pmu); + if (cpu_pmu->has_branch_stack) + armv8pmu_branch_stack_reset(); return IRQ_HANDLED; } @@ -991,6 +1058,40 @@ return event->hw.idx; } +static bool armv8pmu_branch_stack_init(struct perf_event *event) +{ + if (armv8pmu_branch_attr_valid(event)) { + /* + * If a task gets scheduled out, the current branch records + * get saved in the task's context data, which can be later + * used to fill in the records upon an event overflow. 
Let's + * enable PERF_ATTACH_TASK_DATA in 'event->attach_state' for + * all branch stack sampling perf events. + */ + event->attach_state |= PERF_ATTACH_TASK_DATA; + return true; + } + return false; +} + +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in) +{ + struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu); + void *task_ctx = pmu_ctx->task_ctx_data; + + if (armpmu->has_branch_stack) { + /* Save branch records in task_ctx on sched out */ + if (task_ctx && !sched_in) { + armv8pmu_branch_save(armpmu, task_ctx); + return; + } + + /* Reset branch records on sched in */ + if (sched_in) + armv8pmu_branch_stack_reset(); + } +} + /* * Add an event filter to a given event. */ @@ -1083,6 +1184,9 @@ pmcr |= ARMV8_PMU_PMCR_LP; armv8pmu_pmcr_write(pmcr); + + if (cpu_pmu->has_branch_stack) + armv8pmu_branch_stack_reset(); } static int __armv8_pmuv3_map_event_id(struct arm_pmu *armpmu, @@ -1235,6 +1339,41 @@ cpu_pmu->reg_pmmir = read_pmmir(); else cpu_pmu->reg_pmmir = 0; + + /* + * BRBE is being probed on a single cpu for a + * given PMU. The remaining cpus, are assumed + * to have the exact same BRBE implementation. + */ + armv8pmu_branch_probe(cpu_pmu); +} + +static int branch_records_alloc(struct arm_pmu *armpmu) +{ + struct branch_records __percpu *records; + int cpu; + + records = alloc_percpu_gfp(struct branch_records, GFP_KERNEL); + if (!records) + return -ENOMEM; + + /* + * percpu memory allocated for 'records' gets completely consumed + * here, and never required to be freed up later. So permanently + * losing access to this anchor i.e 'records' is acceptable. + * + * Otherwise this allocation handle would have to be saved up for + * free_percpu() release later if required. + */ + for_each_possible_cpu(cpu) { + struct pmu_hw_events *events_cpu; + struct branch_records *records_cpu; + + events_cpu = per_cpu_ptr(armpmu->hw_events, cpu); + records_cpu = per_cpu_ptr(records, cpu); + events_cpu->branches = records_cpu; + } + return 0;
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/perf/arm_pmuv3_branch.h
Added
@@ -0,0 +1,83 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Branch Record Buffer Extension Helpers. + * + * Copyright (C) 2022-2023 ARM Limited + * + * Author: Anshuman Khandual <anshuman.khandual@arm.com> + */ +#include <linux/perf/arm_pmu.h> + +#ifdef CONFIG_ARM64_BRBE +void armv8pmu_branch_stack_add(struct perf_event *event, struct pmu_hw_events *cpuc); +void armv8pmu_branch_stack_del(struct perf_event *event, struct pmu_hw_events *cpuc); +void armv8pmu_branch_stack_reset(void); +void armv8pmu_branch_probe(struct arm_pmu *arm_pmu); +bool armv8pmu_branch_attr_valid(struct perf_event *event); +void armv8pmu_branch_enable(struct arm_pmu *arm_pmu); +void armv8pmu_branch_disable(void); +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, + struct perf_event *event); +void arm64_filter_branch_records(struct pmu_hw_events *cpuc, + struct perf_event *event, + struct branch_records *event_records); +void armv8pmu_branch_save(struct arm_pmu *arm_pmu, void *ctx); +int armv8pmu_task_ctx_cache_alloc(struct arm_pmu *arm_pmu); +void armv8pmu_task_ctx_cache_free(struct arm_pmu *arm_pmu); +#else +static inline void armv8pmu_branch_stack_add(struct perf_event *event, struct pmu_hw_events *cpuc) +{ +} + +static inline void armv8pmu_branch_stack_del(struct perf_event *event, struct pmu_hw_events *cpuc) +{ +} + +static inline void armv8pmu_branch_stack_reset(void) +{ +} + +static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) +{ +} + +static inline bool armv8pmu_branch_attr_valid(struct perf_event *event) +{ + WARN_ON_ONCE(!has_branch_stack(event)); + return false; +} + +static inline void armv8pmu_branch_enable(struct arm_pmu *arm_pmu) +{ +} + +static inline void armv8pmu_branch_disable(void) +{ +} + +static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, + struct perf_event *event) +{ + WARN_ON_ONCE(!has_branch_stack(event)); +} + +static inline void arm64_filter_branch_records(struct pmu_hw_events *cpuc, + struct perf_event *event, + struct branch_records *event_records) +{ + +} + +static inline void armv8pmu_branch_save(struct arm_pmu *arm_pmu, void *ctx) +{ +} + +static inline int armv8pmu_task_ctx_cache_alloc(struct arm_pmu *arm_pmu) +{ + return 0; +} + +static inline void armv8pmu_task_ctx_cache_free(struct arm_pmu *arm_pmu) +{ +} +#endif
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/soc/hisilicon/Kconfig
Changed
@@ -49,9 +49,12 @@
	  interconnection bus protocol.
	  The performance of application may be affected if some HCCS
	  ports are not in full lane status, have a large number of CRC
-	  errors and so on.
+	  errors and so on. This can also help to reduce system power
+	  consumption when some HCCS ports on the platform support the
+	  low power feature.

	  Say M here if you want to include support for querying the
-	  health status and port information of HCCS on Kunpeng SoC.
+	  health status and port information of HCCS, or for reducing
+	  system power consumption, on Kunpeng SoC.

endmenu
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/soc/hisilicon/kunpeng_hccs.c
Changed
@@ -21,11 +21,22 @@ * - if all enabled ports are in linked * - if all linked ports are in full lane * - CRC error count sum + * + * - Retrieve all HCCS types used on the platform. + * + * - Support low power feature for all specified HCCS type ports, and + * provide the following interface: + * - query HCCS types supported increasing and decreasing lane number. + * - decrease lane number of all specified HCCS type ports on idle state. + * - increase lane number of all specified HCCS type ports. */ #include <linux/acpi.h> +#include <linux/delay.h> #include <linux/iopoll.h> #include <linux/platform_device.h> +#include <linux/stringify.h> #include <linux/sysfs.h> +#include <linux/types.h> #include <acpi/pcc.h> @@ -53,6 +64,42 @@ return container_of(k, struct hccs_chip_info, kobj); } +static struct hccs_dev *device_kobj_to_hccs_dev(struct kobject *k) +{ + struct device *dev = container_of(k, struct device, kobj); + struct platform_device *pdev = + container_of(dev, struct platform_device, dev); + + return platform_get_drvdata(pdev); +} + +static char *hccs_port_type_to_name(struct hccs_dev *hdev, u8 type) +{ + u16 i; + + for (i = 0; i < hdev->used_type_num; i++) { + if (hdev->type_name_mapsi.type == type) + return hdev->type_name_mapsi.name; + } + + return NULL; +} + +static int hccs_name_to_port_type(struct hccs_dev *hdev, + const char *name, u8 *type) +{ + u16 i; + + for (i = 0; i < hdev->used_type_num; i++) { + if (strcmp(hdev->type_name_mapsi.name, name) == 0) { + *type = hdev->type_name_mapsi.type; + return 0; + } + } + + return -EINVAL; +} + struct hccs_register_ctx { struct device *dev; u8 chan_id; @@ -144,7 +191,7 @@ pcc_chan = pcc_mbox_request_channel(cl, hdev->chan_id); if (IS_ERR(pcc_chan)) { - dev_err(dev, "PPC channel request failed.\n"); + dev_err(dev, "PCC channel request failed.\n"); rc = -ENODEV; goto out; } @@ -170,15 +217,21 @@ goto err_mbx_channel_free; } - if (pcc_chan->shmem_base_addr) { - cl_info->pcc_comm_addr = ioremap(pcc_chan->shmem_base_addr, - pcc_chan->shmem_size); - if (!cl_info->pcc_comm_addr) { - dev_err(dev, "Failed to ioremap PCC communication region for channel-%u.\n", - hdev->chan_id); - rc = -ENOMEM; - goto err_mbx_channel_free; - } + if (!pcc_chan->shmem_base_addr || + pcc_chan->shmem_size != HCCS_PCC_SHARE_MEM_BYTES) { + dev_err(dev, "The base address or size (%llu) of PCC communication region is invalid.\n", + pcc_chan->shmem_size); + rc = -EINVAL; + goto err_mbx_channel_free; + } + + cl_info->pcc_comm_addr = ioremap(pcc_chan->shmem_base_addr, + pcc_chan->shmem_size); + if (!cl_info->pcc_comm_addr) { + dev_err(dev, "Failed to ioremap PCC communication region for channel-%u.\n", + hdev->chan_id); + rc = -ENOMEM; + goto err_mbx_channel_free; } return 0; @@ -451,6 +504,7 @@ struct device *dev = hdev->dev; struct hccs_chip_info *chip; struct hccs_die_info *die; + bool has_die_info = false; u8 i, j; int ret; @@ -459,6 +513,7 @@ if (!chip->die_num) continue; + has_die_info = true; chip->dies = devm_kzalloc(hdev->dev, chip->die_num * sizeof(struct hccs_die_info), GFP_KERNEL); @@ -480,7 +535,7 @@ } } - return 0; + return has_die_info ? 
0 : -EINVAL; } static int hccs_get_bd_info(struct hccs_dev *hdev, u8 opcode, @@ -586,7 +641,7 @@ port = &die->portsi; port->port_id = attrsi.port_id; port->port_type = attrsi.port_type; - port->lane_mode = attrsi.lane_mode; + port->max_lane_num = attrsi.max_lane_num; port->enable = attrsi.enable; port->die = die; } @@ -601,6 +656,7 @@ struct device *dev = hdev->dev; struct hccs_chip_info *chip; struct hccs_die_info *die; + bool has_port_info = false; u8 i, j; int ret; @@ -611,6 +667,7 @@ if (!die->port_num) continue; + has_port_info = true; die->ports = devm_kzalloc(dev, die->port_num * sizeof(struct hccs_port_info), GFP_KERNEL); @@ -629,7 +686,7 @@ } } - return 0; + return has_port_info ? 0 : -EINVAL; } static int hccs_get_hw_info(struct hccs_dev *hdev) @@ -660,6 +717,55 @@ return 0; } +static u16 hccs_calc_used_type_num(struct hccs_dev *hdev, + unsigned long *hccs_ver) +{ + struct hccs_chip_info *chip; + struct hccs_port_info *port; + struct hccs_die_info *die; + u16 used_type_num = 0; + u16 i, j, k; + + for (i = 0; i < hdev->chip_num; i++) { + chip = &hdev->chipsi; + for (j = 0; j < chip->die_num; j++) { + die = &chip->diesj; + for (k = 0; k < die->port_num; k++) { + port = &die->portsk; + set_bit(port->port_type, hccs_ver); + } + } + } + + for_each_set_bit(i, hccs_ver, HCCS_IP_MAX + 1) + used_type_num++; + + return used_type_num; +} + +static int hccs_init_type_name_maps(struct hccs_dev *hdev) +{ + DECLARE_BITMAP(hccs_ver, HCCS_IP_MAX + 1) = {}; + unsigned int i; + u16 idx = 0;
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/soc/hisilicon/kunpeng_hccs.h
Changed
@@ -10,6 +10,19 @@
 * | P0 | P1 | P2 | P3 | P0 | P1 | P2 | P3 | P0 | P1 | P2 | P3 |P0 | P1 | P2 | P3 |
 */
+enum hccs_port_type {
+	HCCS_V1 = 1,
+	HCCS_V2,
+};
+
+#define HCCS_IP_PREFIX		"HCCS-v"
+#define HCCS_IP_MAX		255
+#define HCCS_NAME_MAX_LEN	9
+struct hccs_type_name_map {
+	u8 type;
+	char name[HCCS_NAME_MAX_LEN + 1];
+};
+
/*
 * This value cannot be 255, otherwise the loop of the multi-BD communication
 * case cannot end.
@@ -19,7 +32,7 @@
struct hccs_port_info {
	u8 port_id;
	u8 port_type;
-	u8 lane_mode;
+	u8 max_lane_num;
	bool enable; /* if the port is enabled */
	struct kobject kobj;
	bool dir_created;
@@ -67,13 +80,18 @@
	bool has_txdone_irq;
};

+#define HCCS_CAPS_HCCS_V2_PM	BIT_ULL(0)
+
struct hccs_dev {
	struct device *dev;
	struct acpi_device *acpi_dev;
	const struct hccs_verspecific_data *verspec_data;
+	/* device capabilities from firmware, like HCCS_CAPS_xxx. */
	u64 caps;
	u8 chip_num;
	struct hccs_chip_info *chips;
+	u16 used_type_num;
+	struct hccs_type_name_map *type_name_maps;
	u8 chan_id;
	struct mutex lock;
	struct hccs_mbox_client_info cl_info;
@@ -91,6 +109,9 @@
	HCCS_GET_DIE_PORTS_LANE_STA,
	HCCS_GET_DIE_PORTS_LINK_STA,
	HCCS_GET_DIE_PORTS_CRC_ERR_CNT,
+	HCCS_GET_PORT_IDLE_STATUS,
+	HCCS_PM_DEC_LANE,
+	HCCS_PM_INC_LANE,
	HCCS_SUB_CMD_MAX = 255,
};

@@ -113,7 +134,7 @@
struct hccs_port_attr {
	u8 port_id;
	u8 port_type;
-	u8 lane_mode;
+	u8 max_lane_num;
	u8 enable : 1; /* if the port is enabled */
	u16 rsv2;
};
@@ -134,6 +155,14 @@
	u8 port_id;
};

+#define HCCS_PREPARE_INC_LANE		1
+#define HCCS_GET_ADAPT_RES		2
+#define HCCS_START_RETRAINING		3
+struct hccs_inc_lane_req_param {
+	u8 port_type;
+	u8 opt_type;
+};
+
#define HCCS_PORT_RESET		1
#define HCCS_PORT_SETUP		2
#define HCCS_PORT_CONFIG	3
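As an aside, HCCS_NAME_MAX_LEN (9) is exactly the length of the longest possible name, "HCCS-v255", given HCCS_IP_MAX of 255. The tail of hccs_init_type_name_maps() is truncated in the kunpeng_hccs.c hunk above, so the following is only a hypothetical sketch of how a type/name map entry could be filled from a port type using these definitions; hccs_fill_type_name() is an invented helper, not a function from the patch.

#include <stdio.h>

#define HCCS_IP_PREFIX    "HCCS-v"
#define HCCS_NAME_MAX_LEN 9

struct hccs_type_name_map {
	unsigned char type;
	char name[HCCS_NAME_MAX_LEN + 1];
};

/* Hypothetical helper: build the human-readable name for one HCCS type. */
static void hccs_fill_type_name(struct hccs_type_name_map *map, unsigned char type)
{
	map->type = type;
	snprintf(map->name, sizeof(map->name), HCCS_IP_PREFIX "%u", type);
}

int main(void)
{
	struct hccs_type_name_map map;

	hccs_fill_type_name(&map, 2);
	printf("%s\n", map.name);	/* prints "HCCS-v2" */
	return 0;
}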
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
Changed
@@ -1679,6 +1679,7 @@ if (core_vdev->mig_ops) vf_debugfs_exit(hisi_acc_vdev); + hisi_acc_vf_disable_fds(hisi_acc_vdev); iounmap(vf_qm->io_base); vfio_pci_core_close_device(core_vdev); } @@ -1699,6 +1700,7 @@ hisi_acc_vdev->vf_id = pci_iov_vf_id(pdev) + 1; hisi_acc_vdev->pf_qm = pf_qm; hisi_acc_vdev->vf_dev = pdev; + hisi_acc_vdev->vf_qm_state = QM_NOT_READY; mutex_init(&hisi_acc_vdev->state_mutex); core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_PRE_COPY;
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/virtio/virtio_mmio.c
Changed
@@ -70,7 +70,7 @@ #include <linux/virtio_config.h> #include <uapi/linux/virtio_mmio.h> #include <linux/virtio_ring.h> - +#include <linux/virtcca_cvm_domain.h> /* The alignment to use between consumer and producer parts of vring. @@ -619,6 +619,7 @@ unsigned long magic; int rc; + enable_swiotlb_for_cvm_dev(&pdev->dev, true); vm_dev = kzalloc(sizeof(*vm_dev), GFP_KERNEL); if (!vm_dev) return -ENOMEM;
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/virtio/virtio_pci_common.c
Changed
@@ -525,6 +525,8 @@ struct virtio_pci_device *vp_dev, *reg_dev = NULL; int rc; + enable_swiotlb_for_cvm_dev(&pci_dev->dev, true); + /* allocate our structure and fill it out */ vp_dev = kzalloc(sizeof(struct virtio_pci_device), GFP_KERNEL); if (!vp_dev)
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/virtio/virtio_pci_common.h
Changed
@@ -29,6 +29,7 @@ #include <linux/virtio_pci_modern.h> #include <linux/highmem.h> #include <linux/spinlock.h> +#include <linux/virtcca_cvm_domain.h> struct virtio_pci_vq_info { /* the actual virtqueue */
View file
_service:recompress:tar_scm:kernel.tar.gz/drivers/virtio/virtio_ring.c
Changed
@@ -13,6 +13,7 @@ #include <linux/dma-mapping.h> #include <linux/kmsan.h> #include <linux/spinlock.h> +#include <linux/virtcca_cvm_domain.h> #include <xen/xen.h> #ifdef DEBUG @@ -294,6 +295,8 @@ if (xen_domain()) return true; + if (virtcca_cvm_domain()) + return true; return false; }
View file
_service:recompress:tar_scm:kernel.tar.gz/include/linux/if_caqm.h
Added
@@ -0,0 +1,485 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright (c) Huawei Technologies Co., Ltd. 2020-2024 + * All rights reserved. + * + * CAQM An implementation of 802.1 CAQM tagging. + * Authors: + * Chengjun Jia <jiachengjun2@huawei.com> + * Shurui Ding <dongshurui@huawei.com> + */ +#ifndef _LINUX_IF_CAQM_H_ +#define _LINUX_IF_CAQM_H_ + +#include <linux/types.h> +#include <linux/jump_label.h> +#include <linux/skbuff.h> +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/rtnetlink.h> +#include <linux/bug.h> +#include <linux/if_vlan.h> +#include <linux/neighbour.h> + +#ifdef CONFIG_ETH_CAQM + +#define CAQM_HLEN (4) +#define CAQM_MAX_DEPTH (2) +#define CAQM_RECV_EN (true) +#define CAQM_SEND_EN (true) + +#define FIXED_POINT_8 (8U) +#define FIXED_POINT_20 (20U) +#define FIXED_POINT_8_UNIT (1<<8U) +#define FIXED_POINT_20_UNIT (1<<20U) + +extern int sysctl_caqm_cc_type; +extern int sysctl_caqm_debug_info; +extern int sysctl_caqm_alpha_fx_8; +extern int sysctl_caqm_beta; +extern unsigned int sysctl_caqm_min_cwnd; +extern int sysctl_caqm_mtu_unit; +extern int sysctl_caqm_data_hint_unit; +extern unsigned int sysctl_caqm_ack_hint_unit; +extern struct static_key_false sysctl_caqm_enable; +extern u8 sysctl_caqm_en_data; +extern u64 sysctl_caqm_filter_nics; +extern u32 sysctl_caqm_rtt_standard; + +/** + * struct caqm_hdr_info - caqm ethernet header congestion control information + * @cc_type: 3'b000 = CAQM + * @is_last_hop: Location, indicate whether the congestion is the last hop of network + * @padding: Bit11: useless + * @caqm_en: Enable caqm, 0: Disable; 1: Enable + * @c_bit: Congestion status, 0: None-congestion; 1: Congestion + * @i_bit: Hint valid status, 0: Ignore the value of Hint field, see it as 0; 1: Hint is valid + * @hint: carries the CAQM Hint value + */ +struct caqm_hdr_info { +#if defined(__BIG_ENDIAN_BITFIELD) + __u8 cc_type:3, + is_last_hop:1, + padding:1, + caqm_en:1, + c_bit:1, + i_bit:1; +#elif defined(__LITTLE_ENDIAN_BITFIELD) + __u8 i_bit:1, + c_bit:1, + caqm_en:1, + padding:1, + is_last_hop:1, + cc_type:3; +#else +#error "Please fix <asm/byteorder.h>" +#endif + __u8 hint; +} __packed; + +static inline int get_caqm_real_hint_bytes(struct caqm_hdr_info *cinfo) +{ + if (cinfo->cc_type == sysctl_caqm_cc_type) { + if (cinfo->c_bit == cinfo->i_bit) { + return 0; + } else if (cinfo->caqm_en) { + if (cinfo->c_bit) + return 0 - sysctl_caqm_beta; + else + return cinfo->hint * sysctl_caqm_data_hint_unit; + } else if (cinfo->c_bit) { + return (0 - (cinfo->hint)) * sysctl_caqm_ack_hint_unit; + } else { + return cinfo->hint * sysctl_caqm_ack_hint_unit; + } + } + return 0; +} + +/* + * struct caqm_hdr - caqm header + * @h_caqm_info: Congestion Control Information + * @h_caqm_encapsulated_proto: packet type ID or len + */ +struct caqm_hdr { + __be16 h_caqm_info; + __be16 h_caqm_encapsulated_proto; +}; + +#define TCP_CONG_NEEDS_CAQM 0x4 +// #define TCP_CONG_MASK (TCP_CONG_NON_RESTRICTED | TCP_CONG_NEEDS_ECN | TCP_CONG_NEEDS_CAQM) + +/** + * eth_type_caqm - check for valid caqm ether type. + * @ethertype: ether type to check + * + * Returns true if the ether type is a caqm ether type. 
+ */ +static inline bool eth_type_caqm(__be16 ethertype) +{ + return ethertype == htons(CONFIG_ETH_P_CAQM); +} + +#define CAQM_PKT_ACK (0) +#define CAQM_PKT_DATA (1) + +/** + * skb_caqm_info - caqm info in skbuff + * @send_en: true if need send caqm hdr + * @recv_en: true if be recived caqm hdr + * @pkt_type: CAQM_PKT_ACK or CAQM_PTK_DATA + * @send_hdr: the caqm hdr will send out + * @recv_hint: the hint value in recive packet */ +struct skb_caqm_info { + __u16 send_en:1, + recv_en:1, + pkt_type:1; + __u16 send_hdr; + __s32 recv_hint; // unit is Byte +}; + +static inline struct skb_caqm_info *get_skb_caqm_info(struct sk_buff *skb) +{ + return (struct skb_caqm_info *)&(skb->caqm_info); +} + +static inline void set_skb_caqm_info_send_en(struct sk_buff *skb, bool send_en) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + cinfo->send_en = send_en; +} + +static inline void set_skb_caqm_info_recv_en(struct sk_buff *skb, bool recv_en) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + cinfo->recv_en = recv_en; +} + +static inline void set_skb_caqm_info_pkt_type(struct sk_buff *skb, bool pkt_type) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + cinfo->pkt_type = pkt_type; +} + +static inline void set_skb_caqm_info_send_hdr(struct sk_buff *skb, u16 send_hdr) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + cinfo->send_hdr = send_hdr; +} + +static inline void set_skb_caqm_info_recv_hint(struct sk_buff *skb, __s32 recv_hint) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + cinfo->recv_hint = recv_hint; +} + +static inline void caqm_set_encap_proto(struct sk_buff *skb, + struct caqm_hdr *chdr) +{ + __be16 proto; + unsigned short *rawp; + + proto = chdr->h_caqm_encapsulated_proto; + if (eth_proto_is_802_3(proto)) { + skb->protocol = proto; + return; + } + + rawp = (unsigned short *)(chdr + 1); + if (*rawp == 0xFFFF) + /* + * This is a magic hack to spot IPX packets. Older Novell + * breaks the protocol design and runs IPX over 802.3 without + * an 802.2 LLC layer. We look for FFFF which isn't a used + * 802.2 SSAP/DSAP. This won't work for fault tolerant netware + * but does for the rest.
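The credit computed by get_caqm_real_hint_bytes() above depends on the caqm_en, c_bit and i_bit flags together with the hint field. The standalone sketch below (not kernel code) reproduces that decision tree, ignoring the cc_type match that the real function checks first, and plugs in the sysctl defaults from net/core/if_caqm.c (beta = 512, data_hint_unit = 8, ack_hint_unit = 64); field values are illustrative.

#include <stdio.h>

static int caqm_hint_bytes(int caqm_en, int c_bit, int i_bit, unsigned int hint)
{
	const int beta = 512, data_hint_unit = 8, ack_hint_unit = 64;

	if (c_bit == i_bit)
		return 0;			/* hint ignored */
	if (caqm_en)
		return c_bit ? -beta : (int)hint * data_hint_unit;
	/* caqm_en == 0: hint scaled by the ACK unit, sign from c_bit */
	return c_bit ? -(int)hint * ack_hint_unit : (int)hint * ack_hint_unit;
}

int main(void)
{
	/* data packet, no congestion, hint = 4 -> +32 bytes of credit */
	printf("%d\n", caqm_hint_bytes(1, 0, 1, 4));
	/* data packet, congestion marked -> -beta (-512) */
	printf("%d\n", caqm_hint_bytes(1, 1, 0, 4));
	return 0;
}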
View file
_service:recompress:tar_scm:kernel.tar.gz/include/linux/irqchip/arm-gic-v4.h
Changed
@@ -72,10 +72,12 @@ #else } sgi_config16; #endif - atomic_t vmapp_count; }; }; + /* Track the VPE being mapped */ + atomic_t vmapp_count; + /* * Ensures mutual exclusion between affinity setting of the * vPE and vLPI operations using vpe->col_idx.
View file
_service:recompress:tar_scm:kernel.tar.gz/include/linux/perf/arm_pmu.h
Changed
@@ -46,6 +46,18 @@
	}, \
}

+/*
+ * Maximum branch record entries which could be processed
+ * for core perf branch stack sampling support, regardless
+ * of the hardware support available on a given ARM PMU.
+ */
+#define MAX_BRANCH_RECORDS 64
+
+struct branch_records {
+	struct perf_branch_stack branch_stack;
+	struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
+};
+
/* The events for a given PMU register set. */
struct pmu_hw_events {
	/*
@@ -72,6 +84,17 @@
	struct arm_pmu *percpu_pmu;

	int irq;
+
+	struct branch_records *branches;
+
+	/* Active context for task events */
+	void *branch_context;
+
+	/* Active events requesting branch records */
+	unsigned int branch_users;
+
+	/* Active branch sample type filters */
+	unsigned long branch_sample_type;
};

enum armpmu_attr_groups {
@@ -102,8 +125,15 @@
	void (*stop)(struct arm_pmu *);
	void (*reset)(void *);
	int (*map_event)(struct perf_event *event);
+	void (*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
+	bool (*branch_stack_init)(struct perf_event *event);
+	void (*branch_stack_add)(struct perf_event *event, struct pmu_hw_events *cpuc);
+	void (*branch_stack_del)(struct perf_event *event, struct pmu_hw_events *cpuc);
+	void (*branch_stack_reset)(void);
	int num_events;
-	bool secure_access; /* 32-bit ARM only */
+	unsigned int secure_access : 1, /* 32-bit ARM only */
+		has_branch_stack: 1, /* 64-bit ARM only */
+		reserved : 30;
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
	DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
@@ -117,6 +147,11 @@
	/* store the PMMIR_EL1 to expose slots */
	u64 reg_pmmir;

+#ifdef CONFIG_ARM64_BRBE
+	/* store the BRBIDR0_EL1 capturing attributes */
+	u64 reg_brbidr;
+#endif
+
	/* Only to be used by ACPI probing code */
	unsigned long acpi_cpuid;
};
View file
_service:recompress:tar_scm:kernel.tar.gz/include/linux/skbuff.h
Changed
@@ -1051,8 +1051,11 @@ __u16 pad1; #endif ); /* end headers group */ - +#ifdef CONFIG_ETH_CAQM + KABI_USE(1, u64 caqm_info) +#else KABI_RESERVE(1) +#endif KABI_RESERVE(2) KABI_RESERVE(3) KABI_RESERVE(4)
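The hunk above repurposes the first reserved KABI slot as a u64 named caqm_info, and skb_caqm_info casts that slot to its own layout (see include/linux/if_caqm.h earlier in this revision). A standalone sanity check of the size assumption, using plain stdint types in place of the kernel's __u16/__s32 (not part of the patch):

#include <stdint.h>

struct skb_caqm_info {
	uint16_t send_en:1,
		 recv_en:1,
		 pkt_type:1;
	uint16_t send_hdr;
	int32_t  recv_hint;	/* unit is bytes */
};

/* On the usual ABIs this struct packs into 8 bytes, so it fits the slot. */
_Static_assert(sizeof(struct skb_caqm_info) <= sizeof(uint64_t),
	       "skb_caqm_info must fit in a single u64 KABI slot");

int main(void)
{
	return 0;
}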
View file
_service:recompress:tar_scm:kernel.tar.gz/include/linux/virtcca_cvm_domain.h
Added
@@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2024. Huawei Technologies Co., Ltd. All rights reserved. + */ + +#ifndef __VIRTCCA_CVM_DOMAIN_H +#define __VIRTCCA_CVM_DOMAIN_H + +#ifdef CONFIG_HISI_VIRTCCA_GUEST + +#include <asm/virtcca_cvm_guest.h> +static inline bool virtcca_cvm_domain(void) +{ + return is_virtcca_cvm_world(); +} + +extern void enable_swiotlb_for_cvm_dev(struct device *dev, bool enable); + +#else +static inline bool virtcca_cvm_domain(void) +{ + return false; +} + +static inline void enable_swiotlb_for_cvm_dev(struct device *dev, bool enable) {} + +#endif + +#endif /* __VIRTCCA_CVM_DOMAIN_H */
View file
_service:recompress:tar_scm:kernel.tar.gz/kernel/sched/topology.c
Changed
@@ -1765,7 +1765,13 @@
	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
	if (!ret && write) {
		if (oldval != sysctl_sched_cluster) {
+			/*
+			 * This may race with partition_sched_domains_locked(),
+			 * so it needs to be protected by sched_domains_mutex.
+			 */
+			mutex_lock(&sched_domains_mutex);
			set_sched_cluster();
+			mutex_unlock(&sched_domains_mutex);
			arch_rebuild_cpu_topology();
		}
	}
View file
_service:recompress:tar_scm:kernel.tar.gz/net/Kconfig
Changed
@@ -521,4 +521,24 @@
	  Accelerating intra-node communication on the data plane of the
	  Terrace service.

+config ETH_CAQM
+	bool "Enable CAQM for Ethernet"
+	default n
+	help
+	  Enable CAQM (Congestion-Aware Queue Management) for the Ethernet layer.
+	  Unsolved problem: the SYN/SYN-ACK handshake should be handled separately
+	  from the payload path. Until that is solved, this option should not be
+	  enabled in the openeuler_defconfig.
+
+config ETH_P_CAQM
+	hex "CAQM TPID value"
+	depends on ETH_CAQM
+	default "0x8200"
+	range 1501 0xFFFF
+	help
+	  The CAQM TPID, i.e. the protocol number used in the 802 Ethernet
+	  header. Because the value is not fixed yet, it is set here as a
+	  config parameter. See the IANA 802 number assignments to avoid
+	  conflicts: https://www.iana.org/assignments/ieee-802-numbers
+
 endif # if NET
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/Makefile
Changed
@@ -40,3 +40,4 @@
obj-$(CONFIG_BPF_SYSCALL) += sock_map.o
obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
obj-$(CONFIG_OF) += of_net.o
+obj-$(CONFIG_ETH_CAQM) += if_caqm.o
\ No newline at end of file
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/dev.c
Changed
@@ -153,6 +153,7 @@ #include <linux/prandom.h> #include <linux/once_lite.h> #include <net/netdev_rx_queue.h> +#include <linux/if_caqm.h> #include "dev.h" #include "net-sysfs.h" @@ -3397,8 +3398,11 @@ eth = (struct ethhdr *)skb->data; type = eth->h_proto; } - - return vlan_get_protocol_and_depth(skb, type, depth); + type = vlan_get_protocol_and_depth(skb, type, depth); +#ifdef CONFIG_ETH_CAQM + type = caqm_get_protocol_and_depth(skb, type, depth); +#endif + return type; } @@ -5329,6 +5333,9 @@ case htons(ETH_P_IPV6): case htons(ETH_P_8021Q): case htons(ETH_P_8021AD): +#ifdef CONFIG_ETH_CAQM + case htons(CONFIG_ETH_P_CAQM): +#endif return true; default: return false; @@ -5402,6 +5409,14 @@ goto out; } +#ifdef CONFIG_ETH_CAQM + if (eth_type_caqm(skb->protocol)) { + skb = skb_caqm_untag(skb); + if (unlikely(!skb)) + goto out; + } +#endif + if (skb_skip_tc_classify(skb)) goto skip_classify;
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/flow_dissector.c
Changed
@@ -36,6 +36,7 @@ #include <net/netfilter/nf_conntrack_labels.h> #endif #include <linux/bpf-netns.h> +#include "flow_dissector_caqm.h" static void dissector_set_key(struct flow_dissector *flow_dissector, enum flow_dissector_key_id key_id) @@ -1474,6 +1475,12 @@ nhoff, hlen); break; + #ifdef CONFIG_ETH_CAQM + case htons(CONFIG_ETH_P_CAQM): + fdret = rps_try_skip_caqm_hdr(skb, data, &proto, &nhoff, hlen); + break; + #endif + default: fdret = FLOW_DISSECT_RET_OUT_BAD; break;
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/flow_dissector_caqm.h
Added
@@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright (c) Huawei Technologies Co., Ltd. 2024-2024 + * All rights reserved. + * + * caqm in the flow dissector, it is to support the rps for a caqm pkt + * Authors: Chengjun Jia <jiachengjun2@huawei.com> + */ +#ifndef _NET_CORE_FLOW_DISSECTOR_CAQM_H +#define _NET_CORE_FLOW_DISSECTOR_CAQM_H + +#ifdef CONFIG_ETH_CAQM +#include <linux/if_caqm.h> +#include <net/flow_dissector.h> +static inline int rps_try_skip_caqm_hdr(const struct sk_buff *skb, const void *data, + __be16 *proto_ptr, int *nhoff_ptr, const int hlen) +{ + const struct caqm_hdr *caqm = NULL; + struct caqm_hdr _caqm; + + if (!static_branch_unlikely(&sysctl_caqm_enable)) + return FLOW_DISSECT_RET_OUT_BAD; + + caqm = __skb_header_pointer(skb, *nhoff_ptr, sizeof(_caqm), + data, hlen, &_caqm); + if (!caqm) + return FLOW_DISSECT_RET_OUT_BAD; + + *proto_ptr = caqm->h_caqm_encapsulated_proto; + *nhoff_ptr += sizeof(*caqm); + return FLOW_DISSECT_RET_PROTO_AGAIN; +} + +#endif +#endif
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/gro.c
Changed
@@ -3,6 +3,7 @@ #include <net/dst_metadata.h> #include <net/busy_poll.h> #include <trace/events/net.h> +#include <linux/if_caqm.h> #define MAX_GRO_SKBS 8 @@ -129,6 +130,7 @@ lp = NAPI_GRO_CB(p)->last; pinfo = skb_shinfo(lp); + caqm_update_hint_in_gro(skb, p); if (headlen <= offset) { skb_frag_t *frag; skb_frag_t *frag2; @@ -598,6 +600,7 @@ { gro_result_t ret; + skb_gro_caqm_untag(skb); skb_mark_napi_id(skb, napi); trace_napi_gro_receive_entry(skb);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/if_caqm.c
Added
@@ -0,0 +1,36 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Copyright (c) Huawei Technologies Co., Ltd. 2020-2024 + * All rights reserved. + * + * Define the caqm system control parameters. + */ +#include <linux/if_caqm.h> +#include <linux/types.h> + +#ifdef CONFIG_ETH_CAQM +int sysctl_caqm_cc_type __read_mostly; +EXPORT_SYMBOL(sysctl_caqm_cc_type); +int sysctl_caqm_debug_info __read_mostly = 10; +EXPORT_SYMBOL(sysctl_caqm_debug_info); +int sysctl_caqm_alpha_fx_8 __read_mostly = 1 * FIXED_POINT_8; +EXPORT_SYMBOL(sysctl_caqm_alpha_fx_8); +int sysctl_caqm_beta __read_mostly = 512; +EXPORT_SYMBOL(sysctl_caqm_beta); +unsigned int sysctl_caqm_min_cwnd __read_mostly = 1; +EXPORT_SYMBOL(sysctl_caqm_min_cwnd); +int sysctl_caqm_mtu_unit __read_mostly = 1024; +EXPORT_SYMBOL(sysctl_caqm_mtu_unit); +int sysctl_caqm_data_hint_unit __read_mostly = 8; +EXPORT_SYMBOL(sysctl_caqm_data_hint_unit); +unsigned int sysctl_caqm_ack_hint_unit __read_mostly = 64; +EXPORT_SYMBOL(sysctl_caqm_ack_hint_unit); +struct static_key_false sysctl_caqm_enable __read_mostly; +EXPORT_SYMBOL(sysctl_caqm_enable); +u8 sysctl_caqm_en_data; +EXPORT_SYMBOL(sysctl_caqm_en_data); +u64 sysctl_caqm_filter_nics __read_mostly; +EXPORT_SYMBOL(sysctl_caqm_filter_nics); +// tp->srtt_us is 1/8 us, so the default is 200us +u32 sysctl_caqm_rtt_standard __read_mostly = 200 * 8; +EXPORT_SYMBOL(sysctl_caqm_rtt_standard); +#endif
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/skbuff.c
Changed
@@ -1412,7 +1412,9 @@ #ifdef CONFIG_NET_SCHED CHECK_SKB_FIELD(tc_index); #endif - +#ifdef CONFIG_ETH_CAQM + new->caqm_info = old->caqm_info; +#endif } /*
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/sysctl_net_caqm.h
Added
@@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright (c) Huawei Technologies Co., Ltd. 2024-2024 + * All rights reserved. + * + * Authors: Chengjun Jia <jiachengjun2@huawei.com> + */ +#ifndef _NET_CORE_SYSCTL_NET_CAQM_H +#define _NET_CORE_SYSCTL_NET_CAQM_H + +#ifdef CONFIG_ETH_CAQM +#include <linux/if_caqm.h> +#include <linux/sysctl.h> + +#define INT16_MAX (32767) + +// cc_type is 3bit, so the max value is 0b'111 +static const unsigned int sysctl_caqm_cc_type_max = 7; +static const unsigned int sysctl_caqm_alpha_fx_8_max = INT16_MAX; +static const unsigned int sysctl_caqm_mtu_unit_min = 64; +static const unsigned int sysctl_caqm_mtu_unit_max = 9000; +static const unsigned int sysctl_caqm_data_hint_unit_max = 1024; +static const unsigned int sysctl_caqm_ack_hint_unit_max = 1024; + +static int proc_caqm_enable(struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + int ret; + struct ctl_table tmp = { + .data = &sysctl_caqm_en_data, + .maxlen = sizeof(u8), + .mode = table->mode, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_ONE, + }; + + ret = proc_dou8vec_minmax(&tmp, write, buffer, lenp, ppos); + + if (write) { + if (sysctl_caqm_en_data) + static_branch_enable(&sysctl_caqm_enable); + else + static_branch_disable(&sysctl_caqm_enable); + } + + return ret; +} +#endif +#endif /* _NET_CORE_SYSCTL_NET_CAQM_H */
View file
_service:recompress:tar_scm:kernel.tar.gz/net/core/sysctl_net_core.c
Changed
@@ -25,6 +25,7 @@ #include <net/pkt_sched.h> #include "dev.h" +#include "sysctl_net_caqm.h" static int int_3600 = 3600; static int min_sndbuf = SOCK_MIN_SNDBUF; @@ -581,6 +582,102 @@ .proc_handler = set_default_qdisc }, #endif +#ifdef CONFIG_ETH_CAQM + { + .procname = "sysctl_caqm_cc_type", + .data = &sysctl_caqm_cc_type, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = (void *)&sysctl_caqm_cc_type_max, + }, + { + .procname = "sysctl_caqm_debug_info", + .data = &sysctl_caqm_debug_info, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_INT_MAX, + }, + { + .procname = "sysctl_caqm_alpha_fx_8", + .data = &sysctl_caqm_alpha_fx_8, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = (void *)&sysctl_caqm_alpha_fx_8_max, + }, + { + .procname = "sysctl_caqm_beta", + .data = &sysctl_caqm_beta, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = SYSCTL_INT_MAX, + }, + { + .procname = "sysctl_caqm_min_cwnd", + .data = &sysctl_caqm_min_cwnd, + .mode = 0644, + .maxlen = sizeof(unsigned int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = SYSCTL_INT_MAX, + }, + { + .procname = "sysctl_caqm_mtu_unit", + .data = &sysctl_caqm_mtu_unit, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = (void *)&sysctl_caqm_mtu_unit_min, + .extra2 = (void *)&sysctl_caqm_mtu_unit_max, + }, + { + .procname = "sysctl_caqm_data_hint_unit", + .data = &sysctl_caqm_data_hint_unit, + .mode = 0644, + .maxlen = sizeof(int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = (void *)&sysctl_caqm_data_hint_unit_max, + }, + { + .procname = "sysctl_caqm_ack_hint_unit", + .data = &sysctl_caqm_ack_hint_unit, + .mode = 0644, + .maxlen = sizeof(unsigned int), + .proc_handler = proc_douintvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = (void *)&sysctl_caqm_ack_hint_unit_max, + }, + { + .procname = "sysctl_caqm_enable", + .data = &sysctl_caqm_en_data, + .mode = 0644, + .maxlen = sizeof(u8), + .proc_handler = proc_caqm_enable, + }, + { + .procname = "sysctl_caqm_filter_nics", + .data = &sysctl_caqm_filter_nics, + .mode = 0644, + .maxlen = sizeof(unsigned long), + .proc_handler = proc_dointvec_minmax, + }, + { + .procname = "sysctl_caqm_rtt_standard", + .data = &sysctl_caqm_rtt_standard, + .mode = 0644, + .maxlen = sizeof(unsigned int), + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + }, +#endif { .procname = "netdev_budget", .data = &netdev_budget,
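The hunk above registers the CAQM knobs in the existing net.core sysctl table, with proc_caqm_enable flipping the sysctl_caqm_enable static branch on writes. A minimal usage sketch, assuming CONFIG_ETH_CAQM is built in and the entries therefore show up under /proc/sys/net/core; the values are only illustrative, not recommendations:

    # toggle the static branch through the u8 handler
    sysctl -w net.core.sysctl_caqm_enable=1
    # tune a couple of the bounded parameters registered above
    sysctl -w net.core.sysctl_caqm_mtu_unit=1024       # clamped to 64..9000
    sysctl -w net.core.sysctl_caqm_ack_hint_unit=64    # clamped to 1..1024
    # read back the current values
    sysctl net.core.sysctl_caqm_beta net.core.sysctl_caqm_rtt_standard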
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ethernet/eth.c
Changed
@@ -62,6 +62,7 @@ #include <net/gro.h> #include <linux/uaccess.h> #include <net/pkt_sched.h> +#include <linux/if_caqm.h> /** * eth_header - create the Ethernet header @@ -80,7 +81,11 @@ unsigned short type, const void *daddr, const void *saddr, unsigned int len) { - struct ethhdr *eth = skb_push(skb, ETH_HLEN); + struct ethhdr *eth; + #ifdef CONFIG_ETH_CAQM + caqm_add_eth_header(skb, &type, dev); + #endif + eth = skb_push(skb, ETH_HLEN); if (type != ETH_P_802_3 && type != ETH_P_802_2) eth->h_proto = htons(type);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/Kconfig
Changed
@@ -647,6 +647,21 @@ For further details see: http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf +config TCP_CONG_CAQM + tristate "TCP Congestion-Aware Queue Management (CAQM)" + depends on ETH_CAQM + default n + help + Enabling CAQM (Congestion-Aware Queue Management) needs both switch and + kernel support: + - the switch needs to support the CAQM Ethernet transport. + - the kernel needs to support the CAQM Ethernet transport. + The algorithm is a credit-based congestion design which is inspired by + XCP. The increase/decrease of cwnd is decided by the network device + instead of the end host. + For further details of XCP, see: + https://dl.acm.org/doi/pdf/10.1145/633025.633035 + config TCP_CONG_CDG tristate "CAIA Delay-Gradient (CDG)" default n @@ -709,6 +724,9 @@ config DEFAULT_DCTCP bool "DCTCP" if TCP_CONG_DCTCP=y + config DEFAULT_CAQM + bool "CAQM" if TCP_CONG_CAQM=y + config DEFAULT_CDG bool "CDG" if TCP_CONG_CDG=y @@ -737,6 +755,7 @@ default "veno" if DEFAULT_VENO default "reno" if DEFAULT_RENO default "dctcp" if DEFAULT_DCTCP + default "caqm" if DEFAULT_CAQM default "cdg" if DEFAULT_CDG default "bbr" if DEFAULT_BBR default "cubic"
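Since TCP_CONG_CAQM is a tristate that depends on ETH_CAQM, a plausible way to exercise the new congestion control on a kernel built with these options is sketched below; the module name is assumed from the Makefile entry (tcp_caqm.o) and only matters for a modular (=m) build:

    modprobe tcp_caqm                                  # only needed for a modular build
    sysctl net.ipv4.tcp_available_congestion_control   # should now list "caqm"
    sysctl -w net.ipv4.tcp_congestion_control=caqm     # or select DEFAULT_CAQM at build time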
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/Makefile
Changed
@@ -51,6 +51,7 @@ obj-$(CONFIG_TCP_CONG_CDG) += tcp_cdg.o obj-$(CONFIG_TCP_CONG_CUBIC) += tcp_cubic.o obj-$(CONFIG_TCP_CONG_DCTCP) += tcp_dctcp.o +obj-$(CONFIG_TCP_CONG_CAQM) += tcp_caqm.o obj-$(CONFIG_TCP_CONG_WESTWOOD) += tcp_westwood.o obj-$(CONFIG_TCP_CONG_HSTCP) += tcp_highspeed.o obj-$(CONFIG_TCP_CONG_HYBLA) += tcp_hybla.o
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/ip_output.c
Changed
@@ -83,6 +83,7 @@ #include <linux/netfilter_bridge.h> #include <linux/netlink.h> #include <linux/tcp.h> +#include <linux/if_caqm.h> static int ip_fragment(struct net *net, struct sock *sk, struct sk_buff *skb, @@ -232,6 +233,10 @@ sock_confirm_neigh(skb, neigh); /* if crossing protocols, can not use the cached header */ +#ifdef CONFIG_ETH_CAQM + /* if caqm, can not use the cached header */ + is_v6gw = is_v6gw | is_caqm_out_enable(skb, dev); +#endif res = neigh_output(neigh, skb, is_v6gw); rcu_read_unlock(); return res;
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/tcp_caqm.c
Added
@@ -0,0 +1,134 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Copyright (c) Huawei Technologies Co., Ltd. 2020-2024 + * All rights reserved. + * + * DataCenter TCP with CAQM (Confined Active Queue Management). + * enable needs specific switch support + * + * Authors: + * Chengjun Jia <jiachengjun2@huawei.com> + * Shurui Ding <dongshurui@huawei.com> + */ + +#include <linux/module.h> +#include <linux/mm.h> +#include <net/tcp.h> +#include <linux/inet_diag.h> +#include "tcp_caqm.h" + +static struct tcp_congestion_ops caqm_reno; + +static size_t caqm_get_info(struct sock *sk, u32 ext, int *attr, union tcp_cc_info *info) +{ + return 0; +} + +static void caqm_init(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct caqm_ca *ca = inet_csk_ca(sk); + + if ((tp->ecn_flags & TCP_CAQM_OK) == TCP_CAQM_OK) { + init_caqm_ca(ca); + return; + } + + /* No CAQM support Fall back to Reno. + * checkout clear work, see tcp_dctcp.c:95 + */ + inet_csk(sk)->icsk_ca_ops = &caqm_reno; +} + +static u32 tcp_caqm_ssthresh(struct sock *sk) +{ + struct caqm_ca *ca = inet_csk_ca(sk); + struct tcp_sock *tp = tcp_sk(sk); + + if (!ca->caqm_ca_enable) + return tcp_reno_ssthresh(sk); + ca->loss_cwnd = tp->snd_cwnd; + // reno: 1/2*snd_cwnd, dctcp: (1-alpha/2)*snd_cwnd; caqm: keep it + return max(tp->snd_cwnd, caqm_para_min_cwnd); +} + +static void tcp_caqm_cong_avoid(struct sock *sk, u32 ack, u32 acked) +{ + struct caqm_ca *ca = inet_csk_ca(sk); + + if (!ca->caqm_ca_enable) { + tcp_reno_cong_avoid(sk, ack, acked); + return; + } +} + +static void tcp_caqm_react_to_loss(struct sock *sk) +{ + struct caqm_ca *ca = inet_csk_ca(sk); + struct tcp_sock *tp = tcp_sk(sk); + + ca->loss_cwnd = tp->snd_cwnd; + tp->snd_ssthresh = max(tp->snd_cwnd >> 1U, 2U); +} + +static void tcp_caqm_state(struct sock *sk, u8 new_state) +{ + if (new_state == TCP_CA_Recovery && + new_state != inet_csk(sk)->icsk_ca_state) + tcp_caqm_react_to_loss(sk); + /* We handle RTO in tcp_caqm_cwnd_event to ensure that we perform only + * one loss-adjustment per RTT. + */ +} + +static u32 tcp_caqm_undo_cwnd(struct sock *sk) +{ + struct caqm_ca *ca = inet_csk_ca(sk); + + if (!ca->caqm_ca_enable) + return tcp_reno_undo_cwnd(sk); + // Update 7.22: reno_undo_cwnd can not keep the cwnd, so keep it + return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd); +} + +/* alg works as reno by default. + * Only after syn--syn-ack to enable, the alg changes to caqm.*/ +static struct tcp_congestion_ops caqm __read_mostly = { + .init = caqm_init, + .ssthresh = tcp_caqm_ssthresh, + .cong_avoid = tcp_caqm_cong_avoid, + .undo_cwnd = tcp_caqm_undo_cwnd, + .set_state = tcp_caqm_state, + .get_info = caqm_get_info, + .flags = TCP_CONG_NEEDS_CAQM, + .owner = THIS_MODULE, + .name = "caqm", +}; + +static struct tcp_congestion_ops caqm_reno __read_mostly = { + .ssthresh = tcp_reno_ssthresh, + .cong_avoid = tcp_reno_cong_avoid, + .undo_cwnd = tcp_reno_undo_cwnd, + .get_info = caqm_get_info, + .owner = THIS_MODULE, + .name = "caqm_reno", +}; + +static int __init caqm_register(void) +{ + BUILD_BUG_ON(sizeof(struct caqm_ca) > ICSK_CA_PRIV_SIZE); + return tcp_register_congestion_control(&caqm); +} + +static void __exit caqm_unregister(void) +{ + tcp_unregister_congestion_control(&caqm); +} + +module_init(caqm_register); +module_exit(caqm_unregister); + +MODULE_AUTHOR("Chengjun Jia <jiachengjun2@huawei.com>"); +MODULE_AUTHOR("Shurui Dong <dongshurui@huawei.com>"); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("TCP w Confined Active Queue Management(CAQM)");
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/tcp_caqm.h
Added
@@ -0,0 +1,320 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright (c) Huawei Technologies Co., Ltd. 2020-2024 + * All rights reserved. + * + * Define the caqm header file + */ +#ifndef _TCP_CAQM_H +#define _TCP_CAQM_H +#ifdef CONFIG_ETH_CAQM + +#include <linux/if_caqm.h> +#include <linux/types.h> +#include <linux/math64.h> +#include <linux/tcp.h> + +/* CAQM parameter */ +#define CAQM_ALPHA_SHIFT (3U) +#define CAQM_MTU_SIZE (sysctl_caqm_mtu_unit) +#define CAQM_UNIT_VALUE (sysctl_caqm_data_hint_unit) +#define CAQM_ACK_UNIT_VALUE (sysctl_caqm_ack_hint_unit) + +// parameter alpha for generating hint value, hint = (caqm_para_alpha >> 3) * MTU / cw * MTU +#define caqm_para_alpha (sysctl_caqm_alpha_fx_8) +// parameter beta for updating caqm_cwd, cwd -= beta when 'C=1' is recved +#define caqm_para_beta (sysctl_caqm_beta) +// parameter for the minimum caqm_cwd +#define caqm_para_min_cwnd (sysctl_caqm_min_cwnd) + +#define MAX_HINT_VAL (0xFF) + +/* caqm_flags in tcp_sock alias ecn_flags(already used 1,2,4,8) */ +#define TCP_CAQM_SRV (16) +#define TCP_CAQM_CLI (32) +#define TCP_CAQM_OK (TCP_CAQM_SRV | TCP_CAQM_CLI) + +#define TCP_EXFLAGS_CLI_CAQM (1) +#define TCP_EXFLAGS_SRV_CAQM (2) + +/* CAQM Alg State */ +enum CaqmState { + CAQM_STATE_START = 1, // Slow_Start + CAQM_STATE_CONG, // Cong_Avoid +}; + +/* CAQM Alg context */ +struct caqm_ca { + u16 caqm_ca_enable; + u16 sender_state; + s64 cw_to_back; // delta cwnd to feedback + s64 left_hint_sum; // left hint, Unit is 1/2^12 + int totalCwndAdjust; + u32 loss_cwnd; +}; + +static inline void init_caqm_ca(struct caqm_ca *caqm_ca) +{ + caqm_ca->caqm_ca_enable = 0; + caqm_ca->sender_state = CAQM_STATE_START; + caqm_ca->cw_to_back = 0; + caqm_ca->loss_cwnd = 0; + caqm_ca->left_hint_sum = 0; + caqm_ca->totalCwndAdjust = 0; + // tp->ecn_flags can not be initialized here, + // require the original tcp code to set it as 0. 
+} + +static inline bool tcp_caqm_is_cwnd_limited(const struct sock *sk) +{ + const struct tcp_sock *tp = tcp_sk(sk); + const struct caqm_ca *caqm_ca = inet_csk_ca(sk); + + if (tp->is_cwnd_limited) + return true; + + if (caqm_ca->sender_state == CAQM_STATE_START) + return tp->snd_cwnd < 2 * tp->max_packets_out; + + return false; +} + +/* Step 1:get the hint value from caqm */ +static inline u8 get_data_to_set_hint(struct sock *sk, const u32 tcp_cwnd) +{ + struct caqm_ca *caqm_ca = inet_csk_ca(sk); + struct tcp_sock *tp = tcp_sk(sk); + + // Attention need MTU_SIZE%sysctl_caqm_data_hint_unit == 0 + u8 hint = CAQM_MTU_SIZE / sysctl_caqm_data_hint_unit; + + if (caqm_ca->sender_state == CAQM_STATE_CONG) { + u32 rtt = sysctl_caqm_rtt_standard; + + if (tp->srtt_us > sysctl_caqm_rtt_standard / 4 && + tp->srtt_us < sysctl_caqm_rtt_standard * 4) { + rtt = tp->srtt_us; + } + u32 temp = (caqm_para_alpha * rtt * CAQM_MTU_SIZE) / + (tcp_cwnd * sysctl_caqm_data_hint_unit * sysctl_caqm_rtt_standard); + temp = temp >> CAQM_ALPHA_SHIFT; + + if (temp >= MAX_HINT_VAL) { + hint = MAX_HINT_VAL; + return hint; + } + hint = temp; + u64 tmp1 = (((u64)rtt * CAQM_MTU_SIZE) << FIXED_POINT_20); + u32 tmp2 = tcp_cwnd * sysctl_caqm_data_hint_unit * sysctl_caqm_rtt_standard; + s64 hint_last; + + tmp1 *= caqm_para_alpha; + tmp1 = tmp1 >> CAQM_ALPHA_SHIFT; + do_div(tmp1, tmp2); + hint_last = (s64)tmp1 - ((s64)hint << FIXED_POINT_20); + caqm_ca->left_hint_sum += hint_last; + if (caqm_ca->left_hint_sum >= FIXED_POINT_20_UNIT) { + caqm_ca->left_hint_sum -= FIXED_POINT_20_UNIT; + hint += 1U; + } + } + return hint; +} + +/* fill the data packet's caqm_hdr_info */ +static inline void build_data_hdr(u8 hint, struct caqm_hdr_info *pkt) +{ + *((u16 *)pkt) = 0; + pkt->cc_type = sysctl_caqm_cc_type; + pkt->caqm_en = 1; + pkt->c_bit = 0; + pkt->i_bit = 1; + pkt->hint = hint; +} + +static inline void set_data_caqm_hdr(struct sk_buff *skb, struct sock *sk, const u32 tcp_cwnd) +{ + u8 data_hint = get_data_to_set_hint(sk, tcp_cwnd); + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + + build_data_hdr(data_hint, (struct caqm_hdr_info *)&cinfo->send_hdr); +} + +/* Step 3: fill ACK and feedback */ +/* Hint upper bound for one ACK */ +#define MAX_WIN_DELTA_VAL (MAX_HINT_VAL * (int)sysctl_caqm_ack_hint_unit) +#define MIN_WIN_DELTA_VAL (0 - MAX_WIN_DELTA_VAL) +static inline int get_ack_back_hint(const struct caqm_ca *caqm_ca) +{ + if (caqm_ca->cw_to_back >= MAX_WIN_DELTA_VAL) + return MAX_HINT_VAL; + else if (caqm_ca->cw_to_back <= MIN_WIN_DELTA_VAL) + return 0 - MAX_HINT_VAL; + u32 cw_to_back_sign = caqm_ca->cw_to_back < 0 ? 1 : 0; + u64 tmp_val = cw_to_back_sign ? (0 - caqm_ca->cw_to_back) : (caqm_ca->cw_to_back); + + do_div(tmp_val, sysctl_caqm_ack_hint_unit); + return cw_to_back_sign ? 
(0 - (s64)tmp_val) : tmp_val; +} + +/* get the ack to back: fill it */ +static inline void build_ack_hdr(int hint, struct caqm_hdr_info *pkt) +{ + *((u16 *)pkt) = 0; + pkt->caqm_en = 0; + pkt->cc_type = sysctl_caqm_cc_type; + if (hint >= 0) { + pkt->c_bit = 0; + pkt->i_bit = 1; + pkt->hint = hint; + } else {// hint < 0, need down speed + pkt->c_bit = 1; + pkt->i_bit = 0; + pkt->hint = 0 - hint; + } +} + +/* set the ack and update cw_to_back */ +static inline void set_ack_caqm_hdr(struct caqm_ca *caqm_ca, struct caqm_hdr_info *ack_hdr_info) +{ + int ack_hint = get_ack_back_hint(caqm_ca); + + build_ack_hdr(ack_hint, ack_hdr_info); + caqm_ca->cw_to_back -= (s64)ack_hint * sysctl_caqm_ack_hint_unit; +} + +/* Step 4: recv ACK, update cwnd */ +static inline void update_caqm_state(struct caqm_ca *caqm_ca, struct sk_buff *skb) +{ + struct skb_caqm_info *cinfo = get_skb_caqm_info(skb); + // if (cinfo->caqm_en || (cinfo->c_bit && cinfo->i_bit)) { + if (cinfo->pkt_type == CAQM_PKT_ACK && cinfo->recv_hint <= 0 && + caqm_ca->sender_state == CAQM_STATE_START) + caqm_ca->sender_state = CAQM_STATE_CONG; +} + +// For tcp_input.c +static inline bool tcp_ca_needs_caqm(const struct sock *sk) +{ + const struct inet_connection_sock *icsk = inet_csk(sk); +
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/tcp_input.c
Changed
@@ -80,6 +80,7 @@ #include <linux/jump_label_ratelimit.h> #include <net/busy_poll.h> #include <net/mptcp.h> +#include "tcp_caqm.h" int sysctl_tcp_max_orphans __read_mostly = NR_FILE; @@ -6006,6 +6007,7 @@ struct tcp_sock *tp = tcp_sk(sk); unsigned int len = skb->len; + try_to_recv_pkt_w_caqm(sk, skb); /* TCP congestion window tracking */ trace_tcp_probe(sk, skb); @@ -6628,6 +6630,7 @@ bool acceptable; SKB_DR(reason); + try_to_recv_pkt_w_caqm(sk, skb); switch (sk->sk_state) { case TCP_CLOSE: SKB_DR_SET(reason, TCP_CLOSE);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/tcp_minisocks.c
Changed
@@ -22,6 +22,7 @@ #include <net/tcp.h> #include <net/xfrm.h> #include <net/busy_poll.h> +#include "tcp_caqm.h" static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win) { @@ -873,6 +874,7 @@ int ret = 0; int state = child->sk_state; + tcp_copy_ecn_flags(parent, child); /* record sk_napi_id and sk_rx_queue_mapping of child. */ sk_mark_napi_id_set(child, skb);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/ipv4/tcp_output.c
Changed
@@ -46,6 +46,7 @@ #include <linux/static_key.h> #include <trace/events/tcp.h> +#include "tcp_caqm.h" /* Refresh clocks of a TCP socket, * ensuring monotically increasing values. @@ -1308,8 +1309,10 @@ struct tcphdr *th; u64 prior_wstamp; int err; + u8 tcp_hdr_rsrvd_4b; BUG_ON(!skb || !tcp_skb_pcount(skb)); + tcp_hdr_rsrvd_4b = try_to_update_skb_for_caqm(sk, skb); tp = tcp_sk(sk); prior_wstamp = tp->tcp_wstamp_ns; tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache); @@ -1395,6 +1398,10 @@ th->ack_seq = htonl(rcv_nxt); *(((__be16 *)th) + 6) = htons(((tcp_header_size >> 2) << 12) | tcb->tcp_flags); +#ifdef CONFIG_ETH_CAQM + if (static_branch_unlikely(&sysctl_caqm_enable)) + *(((__be16 *)th) + 6) |= htons((tcp_hdr_rsrvd_4b & 0x0F) << 8); +#endif th->check = 0; th->urg_ptr = 0; @@ -1756,6 +1763,9 @@ It is MMS_S - sizeof(tcphdr) of rfc1122 */ mss_now = pmtu - icsk->icsk_af_ops->net_header_len - sizeof(struct tcphdr); + #ifdef CONFIG_ETH_CAQM + mss_now -= caqm_leave_room_size(sk); // leave room for caqm + #endif /* IPv6 adds a frag_hdr in case RTAX_FEATURE_ALLFRAG is set */ if (icsk->icsk_af_ops->net_frag_header_len) { @@ -3775,6 +3785,7 @@ th->window = htons(min(req->rsk_rcv_wnd, 65535U)); tcp_options_write(th, NULL, &opts); th->doff = (tcp_header_size >> 2); + tcp_caqm_make_synack(sk, th); TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); #ifdef CONFIG_TCP_MD5SIG
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/fastopen.c
Changed
@@ -68,12 +68,12 @@ skb = skb_peek_tail(&sk->sk_receive_queue); if (skb) { WARN_ON_ONCE(MPTCP_SKB_CB(skb)->end_seq); - pr_debug("msk %p moving seq %llx -> %llx end_seq %llx -> %llx", sk, + pr_debug("msk %p moving seq %llx -> %llx end_seq %llx -> %llx\n", sk, MPTCP_SKB_CB(skb)->map_seq, MPTCP_SKB_CB(skb)->map_seq + msk->ack_seq, MPTCP_SKB_CB(skb)->end_seq, MPTCP_SKB_CB(skb)->end_seq + msk->ack_seq); MPTCP_SKB_CB(skb)->map_seq += msk->ack_seq; MPTCP_SKB_CB(skb)->end_seq += msk->ack_seq; } - pr_debug("msk=%p ack_seq=%llx", msk, msk->ack_seq); + pr_debug("msk=%p ack_seq=%llx\n", msk, msk->ack_seq); }
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/options.c
Changed
@@ -117,7 +117,7 @@ mp_opt->suboptions |= OPTION_MPTCP_CSUMREQD; ptr += 2; } - pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d csum=%u", + pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d csum=%u\n", version, flags, opsize, mp_opt->sndr_key, mp_opt->rcvr_key, mp_opt->data_len, mp_opt->csum); break; @@ -131,7 +131,7 @@ ptr += 4; mp_opt->nonce = get_unaligned_be32(ptr); ptr += 4; - pr_debug("MP_JOIN bkup=%u, id=%u, token=%u, nonce=%u", + pr_debug("MP_JOIN bkup=%u, id=%u, token=%u, nonce=%u\n", mp_opt->backup, mp_opt->join_id, mp_opt->token, mp_opt->nonce); } else if (opsize == TCPOLEN_MPTCP_MPJ_SYNACK) { @@ -142,19 +142,19 @@ ptr += 8; mp_opt->nonce = get_unaligned_be32(ptr); ptr += 4; - pr_debug("MP_JOIN bkup=%u, id=%u, thmac=%llu, nonce=%u", + pr_debug("MP_JOIN bkup=%u, id=%u, thmac=%llu, nonce=%u\n", mp_opt->backup, mp_opt->join_id, mp_opt->thmac, mp_opt->nonce); } else if (opsize == TCPOLEN_MPTCP_MPJ_ACK) { mp_opt->suboptions |= OPTION_MPTCP_MPJ_ACK; ptr += 2; memcpy(mp_opt->hmac, ptr, MPTCPOPT_HMAC_LEN); - pr_debug("MP_JOIN hmac"); + pr_debug("MP_JOIN hmac\n"); } break; case MPTCPOPT_DSS: - pr_debug("DSS"); + pr_debug("DSS\n"); ptr++; /* we must clear 'mpc_map' be able to detect MP_CAPABLE @@ -169,7 +169,7 @@ mp_opt->ack64 = (flags & MPTCP_DSS_ACK64) != 0; mp_opt->use_ack = (flags & MPTCP_DSS_HAS_ACK); - pr_debug("data_fin=%d dsn64=%d use_map=%d ack64=%d use_ack=%d", + pr_debug("data_fin=%d dsn64=%d use_map=%d ack64=%d use_ack=%d\n", mp_opt->data_fin, mp_opt->dsn64, mp_opt->use_map, mp_opt->ack64, mp_opt->use_ack); @@ -207,7 +207,7 @@ ptr += 4; } - pr_debug("data_ack=%llu", mp_opt->data_ack); + pr_debug("data_ack=%llu\n", mp_opt->data_ack); } if (mp_opt->use_map) { @@ -231,7 +231,7 @@ ptr += 2; } - pr_debug("data_seq=%llu subflow_seq=%u data_len=%u csum=%d:%u", + pr_debug("data_seq=%llu subflow_seq=%u data_len=%u csum=%d:%u\n", mp_opt->data_seq, mp_opt->subflow_seq, mp_opt->data_len, !!(mp_opt->suboptions & OPTION_MPTCP_CSUMREQD), mp_opt->csum); @@ -293,7 +293,7 @@ mp_opt->ahmac = get_unaligned_be64(ptr); ptr += 8; } - pr_debug("ADD_ADDR%s: id=%d, ahmac=%llu, echo=%d, port=%d", + pr_debug("ADD_ADDR%s: id=%d, ahmac=%llu, echo=%d, port=%d\n", (mp_opt->addr.family == AF_INET6) ? 
"6" : "", mp_opt->addr.id, mp_opt->ahmac, mp_opt->echo, ntohs(mp_opt->addr.port)); break; @@ -309,7 +309,7 @@ mp_opt->rm_list.nr = opsize - TCPOLEN_MPTCP_RM_ADDR_BASE; for (i = 0; i < mp_opt->rm_list.nr; i++) mp_opt->rm_list.idsi = *ptr++; - pr_debug("RM_ADDR: rm_list_nr=%d", mp_opt->rm_list.nr); + pr_debug("RM_ADDR: rm_list_nr=%d\n", mp_opt->rm_list.nr); break; case MPTCPOPT_MP_PRIO: @@ -318,7 +318,7 @@ mp_opt->suboptions |= OPTION_MPTCP_PRIO; mp_opt->backup = *ptr++ & MPTCP_PRIO_BKUP; - pr_debug("MP_PRIO: prio=%d", mp_opt->backup); + pr_debug("MP_PRIO: prio=%d\n", mp_opt->backup); break; case MPTCPOPT_MP_FASTCLOSE: @@ -329,7 +329,7 @@ mp_opt->rcvr_key = get_unaligned_be64(ptr); ptr += 8; mp_opt->suboptions |= OPTION_MPTCP_FASTCLOSE; - pr_debug("MP_FASTCLOSE: recv_key=%llu", mp_opt->rcvr_key); + pr_debug("MP_FASTCLOSE: recv_key=%llu\n", mp_opt->rcvr_key); break; case MPTCPOPT_RST: @@ -343,7 +343,7 @@ flags = *ptr++; mp_opt->reset_transient = flags & MPTCP_RST_TRANSIENT; mp_opt->reset_reason = *ptr; - pr_debug("MP_RST: transient=%u reason=%u", + pr_debug("MP_RST: transient=%u reason=%u\n", mp_opt->reset_transient, mp_opt->reset_reason); break; @@ -354,7 +354,7 @@ ptr += 2; mp_opt->suboptions |= OPTION_MPTCP_FAIL; mp_opt->fail_seq = get_unaligned_be64(ptr); - pr_debug("MP_FAIL: data_seq=%llu", mp_opt->fail_seq); + pr_debug("MP_FAIL: data_seq=%llu\n", mp_opt->fail_seq); break; default: @@ -417,7 +417,7 @@ *size = TCPOLEN_MPTCP_MPC_SYN; return true; } else if (subflow->request_join) { - pr_debug("remote_token=%u, nonce=%u", subflow->remote_token, + pr_debug("remote_token=%u, nonce=%u\n", subflow->remote_token, subflow->local_nonce); opts->suboptions = OPTION_MPTCP_MPJ_SYN; opts->join_id = subflow->local_id; @@ -500,7 +500,7 @@ *size = TCPOLEN_MPTCP_MPC_ACK; } - pr_debug("subflow=%p, local_key=%llu, remote_key=%llu map_len=%d", + pr_debug("subflow=%p, local_key=%llu, remote_key=%llu map_len=%d\n", subflow, subflow->local_key, subflow->remote_key, data_len); @@ -509,7 +509,7 @@ opts->suboptions = OPTION_MPTCP_MPJ_ACK; memcpy(opts->hmac, subflow->hmac, MPTCPOPT_HMAC_LEN); *size = TCPOLEN_MPTCP_MPJ_ACK; - pr_debug("subflow=%p", subflow); + pr_debug("subflow=%p\n", subflow); /* we can use the full delegate action helper only from BH context * If we are in process context - sk is flushing the backlog at @@ -675,7 +675,7 @@ *size = len; if (drop_other_suboptions) { - pr_debug("drop other suboptions"); + pr_debug("drop other suboptions\n"); opts->suboptions = 0; /* note that e.g. 
DSS could have written into the memory @@ -695,7 +695,7 @@ } else { MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ECHOADDTX); } - pr_debug("addr_id=%d, ahmac=%llu, echo=%d, port=%d", + pr_debug("addr_id=%d, ahmac=%llu, echo=%d, port=%d\n", opts->addr.id, opts->ahmac, echo, ntohs(opts->addr.port)); return true; @@ -726,7 +726,7 @@ opts->rm_list = rm_list; for (i = 0; i < opts->rm_list.nr; i++) - pr_debug("rm_list_ids%d=%d", i, opts->rm_list.idsi); + pr_debug("rm_list_ids%d=%d\n", i, opts->rm_list.idsi); MPTCP_ADD_STATS(sock_net(sk), MPTCP_MIB_RMADDRTX, opts->rm_list.nr); return true; } @@ -752,7 +752,7 @@ opts->suboptions |= OPTION_MPTCP_PRIO; opts->backup = subflow->request_bkup; - pr_debug("prio=%d", opts->backup); + pr_debug("prio=%d\n", opts->backup); return true; } @@ -794,7 +794,7 @@ opts->suboptions |= OPTION_MPTCP_FASTCLOSE; opts->rcvr_key = READ_ONCE(msk->remote_key); - pr_debug("FASTCLOSE key=%llu", opts->rcvr_key); + pr_debug("FASTCLOSE key=%llu\n", opts->rcvr_key); MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPFASTCLOSETX); return true; } @@ -816,7 +816,7 @@ opts->suboptions |= OPTION_MPTCP_FAIL; opts->fail_seq = subflow->map_seq; - pr_debug("MP_FAIL fail_seq=%llu", opts->fail_seq); + pr_debug("MP_FAIL fail_seq=%llu\n", opts->fail_seq);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/pm.c
Changed
@@ -19,7 +19,7 @@ { u8 add_addr = READ_ONCE(msk->pm.addr_signal); - pr_debug("msk=%p, local_id=%d, echo=%d", msk, addr->id, echo); + pr_debug("msk=%p, local_id=%d, echo=%d\n", msk, addr->id, echo); lockdep_assert_held(&msk->pm.lock); @@ -45,7 +45,7 @@ { u8 rm_addr = READ_ONCE(msk->pm.addr_signal); - pr_debug("msk=%p, rm_list_nr=%d", msk, rm_list->nr); + pr_debug("msk=%p, rm_list_nr=%d\n", msk, rm_list->nr); if (rm_addr) { MPTCP_ADD_STATS(sock_net((struct sock *)msk), @@ -66,7 +66,7 @@ { struct mptcp_pm_data *pm = &msk->pm; - pr_debug("msk=%p, token=%u side=%d", msk, READ_ONCE(msk->token), server_side); + pr_debug("msk=%p, token=%u side=%d\n", msk, READ_ONCE(msk->token), server_side); WRITE_ONCE(pm->server_side, server_side); mptcp_event(MPTCP_EVENT_CREATED, msk, ssk, GFP_ATOMIC); @@ -90,7 +90,7 @@ subflows_max = mptcp_pm_get_subflows_max(msk); - pr_debug("msk=%p subflows=%d max=%d allow=%d", msk, pm->subflows, + pr_debug("msk=%p subflows=%d max=%d allow=%d\n", msk, pm->subflows, subflows_max, READ_ONCE(pm->accept_subflow)); /* try to avoid acquiring the lock below */ @@ -114,7 +114,7 @@ static bool mptcp_pm_schedule_work(struct mptcp_sock *msk, enum mptcp_pm_status new_status) { - pr_debug("msk=%p status=%x new=%lx", msk, msk->pm.status, + pr_debug("msk=%p status=%x new=%lx\n", msk, msk->pm.status, BIT(new_status)); if (msk->pm.status & BIT(new_status)) return false; @@ -129,7 +129,7 @@ struct mptcp_pm_data *pm = &msk->pm; bool announce = false; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); spin_lock_bh(&pm->lock); @@ -153,14 +153,14 @@ void mptcp_pm_connection_closed(struct mptcp_sock *msk) { - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); } void mptcp_pm_subflow_established(struct mptcp_sock *msk) { struct mptcp_pm_data *pm = &msk->pm; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); if (!READ_ONCE(pm->work_pending)) return; @@ -212,7 +212,7 @@ struct mptcp_sock *msk = mptcp_sk(subflow->conn); struct mptcp_pm_data *pm = &msk->pm; - pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id, + pr_debug("msk=%p remote_id=%d accept=%d\n", msk, addr->id, READ_ONCE(pm->accept_addr)); mptcp_event_addr_announced(ssk, addr); @@ -245,7 +245,7 @@ { struct mptcp_pm_data *pm = &msk->pm; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); spin_lock_bh(&pm->lock); @@ -269,7 +269,7 @@ struct mptcp_pm_data *pm = &msk->pm; u8 i; - pr_debug("msk=%p remote_ids_nr=%d", msk, rm_list->nr); + pr_debug("msk=%p remote_ids_nr=%d\n", msk, rm_list->nr); for (i = 0; i < rm_list->nr; i++) mptcp_event_addr_removed(msk, rm_list->idsi); @@ -301,19 +301,19 @@ struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); struct mptcp_sock *msk = mptcp_sk(subflow->conn); - pr_debug("fail_seq=%llu", fail_seq); + pr_debug("fail_seq=%llu\n", fail_seq); if (!READ_ONCE(msk->allow_infinite_fallback)) return; if (!subflow->fail_tout) { - pr_debug("send MP_FAIL response and infinite map"); + pr_debug("send MP_FAIL response and infinite map\n"); subflow->send_mp_fail = 1; subflow->send_infinite_map = 1; tcp_send_ack(sk); } else { - pr_debug("MP_FAIL response received"); + pr_debug("MP_FAIL response received\n"); WRITE_ONCE(subflow->fail_tout, 0); } }
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/pm_netlink.c
Changed
@@ -289,7 +289,7 @@ struct mptcp_sock *msk = entry->sock; struct sock *sk = (struct sock *)msk; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); if (!msk) return; @@ -308,7 +308,7 @@ spin_lock_bh(&msk->pm.lock); if (!mptcp_pm_should_add_signal_addr(msk)) { - pr_debug("retransmit ADD_ADDR id=%d", entry->addr.id); + pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id); mptcp_pm_announce_addr(msk, &entry->addr, false); mptcp_pm_add_addr_send_ack(msk); entry->retrans_times++; @@ -395,7 +395,7 @@ struct sock *sk = (struct sock *)msk; LIST_HEAD(free_list); - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); spin_lock_bh(&msk->pm.lock); list_splice_init(&msk->pm.anno_list, &free_list); @@ -481,7 +481,7 @@ struct sock *ssk = mptcp_subflow_tcp_sock(subflow); bool slow; - pr_debug("send ack for %s", + pr_debug("send ack for %s\n", prio ? "mp_prio" : (mptcp_pm_should_add_signal(msk) ? "add_addr" : "rm_addr")); slow = lock_sock_fast(ssk); @@ -727,7 +727,7 @@ add_addr_accept_max = mptcp_pm_get_add_addr_accept_max(msk); subflows_max = mptcp_pm_get_subflows_max(msk); - pr_debug("accepted %d:%d remote family %d", + pr_debug("accepted %d:%d remote family %d\n", msk->pm.add_addr_accepted, add_addr_accept_max, msk->pm.remote.family); @@ -800,7 +800,7 @@ { struct mptcp_subflow_context *subflow; - pr_debug("bkup=%d", bkup); + pr_debug("bkup=%d\n", bkup); mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); @@ -831,7 +831,7 @@ struct sock *sk = (struct sock *)msk; u8 i; - pr_debug("%s rm_list_nr %d", + pr_debug("%s rm_list_nr %d\n", rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", rm_list->nr); msk_owned_by_me(msk); @@ -863,7 +863,7 @@ if (rm_type == MPTCP_MIB_RMSUBFLOW && id != rm_id) continue; - pr_debug(" -> %s rm_list_ids%d=%u local_id=%u remote_id=%u mpc_id=%u", + pr_debug(" -> %s rm_list_ids%d=%u local_id=%u remote_id=%u mpc_id=%u\n", rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", i, rm_id, id, remote_id, msk->mpc_endpoint_id); spin_unlock_bh(&msk->pm.lock); @@ -920,7 +920,7 @@ spin_lock_bh(&msk->pm.lock); - pr_debug("msk=%p status=%x", msk, pm->status); + pr_debug("msk=%p status=%x\n", msk, pm->status); if (pm->status & BIT(MPTCP_PM_ADD_ADDR_RECEIVED)) { pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_RECEIVED); mptcp_pm_nl_add_addr_received(msk); @@ -1502,7 +1502,7 @@ long s_slot = 0, s_num = 0; struct mptcp_sock *msk; - pr_debug("remove_id=%d", addr->id); + pr_debug("remove_id=%d\n", addr->id); while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { struct sock *sk = (struct sock *)msk;
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/protocol.c
Changed
@@ -138,7 +138,7 @@ !skb_try_coalesce(to, from, &fragstolen, &delta)) return false; - pr_debug("colesced seq %llx into %llx new len %d new end seq %llx", + pr_debug("colesced seq %llx into %llx new len %d new end seq %llx\n", MPTCP_SKB_CB(from)->map_seq, MPTCP_SKB_CB(to)->map_seq, to->len, MPTCP_SKB_CB(from)->end_seq); MPTCP_SKB_CB(to)->end_seq = MPTCP_SKB_CB(from)->end_seq; @@ -216,7 +216,7 @@ end_seq = MPTCP_SKB_CB(skb)->end_seq; max_seq = atomic64_read(&msk->rcv_wnd_sent); - pr_debug("msk=%p seq=%llx limit=%llx empty=%d", msk, seq, max_seq, + pr_debug("msk=%p seq=%llx limit=%llx empty=%d\n", msk, seq, max_seq, RB_EMPTY_ROOT(&msk->out_of_order_queue)); if (after64(end_seq, max_seq)) { /* out of window */ @@ -654,7 +654,7 @@ } } - pr_debug("msk=%p ssk=%p", msk, ssk); + pr_debug("msk=%p ssk=%p\n", msk, ssk); tp = tcp_sk(ssk); do { u32 map_remaining, offset; @@ -739,7 +739,7 @@ u64 end_seq; p = rb_first(&msk->out_of_order_queue); - pr_debug("msk=%p empty=%d", msk, RB_EMPTY_ROOT(&msk->out_of_order_queue)); + pr_debug("msk=%p empty=%d\n", msk, RB_EMPTY_ROOT(&msk->out_of_order_queue)); while (p) { skb = rb_to_skb(p); if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) @@ -761,7 +761,7 @@ int delta = msk->ack_seq - MPTCP_SKB_CB(skb)->map_seq; /* skip overlapping data, if any */ - pr_debug("uncoalesced seq=%llx ack seq=%llx delta=%d", + pr_debug("uncoalesced seq=%llx ack seq=%llx delta=%d\n", MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq, delta); MPTCP_SKB_CB(skb)->offset += delta; @@ -1255,7 +1255,7 @@ size_t copy; int i; - pr_debug("msk=%p ssk=%p sending dfrag at seq=%llu len=%u already sent=%u", + pr_debug("msk=%p ssk=%p sending dfrag at seq=%llu len=%u already sent=%u\n", msk, ssk, dfrag->data_seq, dfrag->data_len, info->sent); if (WARN_ON_ONCE(info->sent > info->limit || @@ -1356,7 +1356,7 @@ mpext->use_map = 1; mpext->dsn64 = 1; - pr_debug("data_seq=%llu subflow_seq=%u data_len=%u dsn64=%d", + pr_debug("data_seq=%llu subflow_seq=%u data_len=%u dsn64=%d\n", mpext->data_seq, mpext->subflow_seq, mpext->data_len, mpext->dsn64); @@ -1905,7 +1905,7 @@ if (!msk->first_pending) WRITE_ONCE(msk->first_pending, dfrag); } - pr_debug("msk=%p dfrag at seq=%llu len=%u sent=%u new=%d", msk, + pr_debug("msk=%p dfrag at seq=%llu len=%u sent=%u new=%d\n", msk, dfrag->data_seq, dfrag->data_len, dfrag->already_sent, !dfrag_collapsed); @@ -2261,7 +2261,7 @@ } } - pr_debug("block timeout %ld", timeo); + pr_debug("block timeout %ld\n", timeo); sk_wait_data(sk, &timeo, NULL); } @@ -2277,7 +2277,7 @@ } } - pr_debug("msk=%p rx queue empty=%d:%d copied=%d", + pr_debug("msk=%p rx queue empty=%d:%d copied=%d\n", msk, skb_queue_empty_lockless(&sk->sk_receive_queue), skb_queue_empty(&msk->receive_queue), copied); if (!(flags & MSG_PEEK)) @@ -2521,6 +2521,12 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow) { + /* The first subflow can already be closed and still in the list */ + if (subflow->close_event_done) + return; + + subflow->close_event_done = true; + if (sk->sk_state == TCP_ESTABLISHED) mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL); @@ -2730,7 +2736,7 @@ if (!ssk) return; - pr_debug("MP_FAIL doesn't respond, reset the subflow"); + pr_debug("MP_FAIL doesn't respond, reset the subflow\n"); slow = lock_sock_fast(ssk); mptcp_subflow_reset(ssk); @@ -2902,7 +2908,7 @@ break; default: if (__mptcp_check_fallback(mptcp_sk(sk))) { - pr_debug("Fallback"); + pr_debug("Fallback\n"); ssk->sk_shutdown |= how; tcp_shutdown(ssk, how); @@ -2912,7 +2918,7 @@ 
WRITE_ONCE(mptcp_sk(sk)->snd_una, mptcp_sk(sk)->snd_nxt); mptcp_schedule_work(sk); } else { - pr_debug("Sending DATA_FIN on subflow %p", ssk); + pr_debug("Sending DATA_FIN on subflow %p\n", ssk); tcp_send_ack(ssk); if (!mptcp_rtx_timer_pending(sk)) mptcp_reset_rtx_timer(sk); @@ -2978,7 +2984,7 @@ struct mptcp_subflow_context *subflow; struct mptcp_sock *msk = mptcp_sk(sk); - pr_debug("msk=%p snd_data_fin_enable=%d pending=%d snd_nxt=%llu write_seq=%llu", + pr_debug("msk=%p snd_data_fin_enable=%d pending=%d snd_nxt=%llu write_seq=%llu\n", msk, msk->snd_data_fin_enable, !!mptcp_send_head(sk), msk->snd_nxt, msk->write_seq); @@ -3002,7 +3008,7 @@ { struct mptcp_sock *msk = mptcp_sk(sk); - pr_debug("msk=%p snd_data_fin_enable=%d shutdown=%x state=%d pending=%d", + pr_debug("msk=%p snd_data_fin_enable=%d shutdown=%x state=%d pending=%d\n", msk, msk->snd_data_fin_enable, sk->sk_shutdown, sk->sk_state, !!mptcp_send_head(sk)); @@ -3017,7 +3023,7 @@ { struct mptcp_sock *msk = mptcp_sk(sk); - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); might_sleep(); @@ -3125,7 +3131,7 @@ mptcp_set_state(sk, TCP_CLOSE); sock_hold(sk); - pr_debug("msk=%p state=%d", sk, sk->sk_state); + pr_debug("msk=%p state=%d\n", sk, sk->sk_state); if (msk->token) mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL); @@ -3557,7 +3563,7 @@ { struct mptcp_sock *msk = mptcp_sk(sk); - pr_debug("msk=%p, ssk=%p", msk, msk->first); + pr_debug("msk=%p, ssk=%p\n", msk, msk->first); if (WARN_ON_ONCE(!msk->first)) return -EINVAL; @@ -3574,7 +3580,7 @@ sk = subflow->conn; msk = mptcp_sk(sk); - pr_debug("msk=%p, token=%u", sk, subflow->token); + pr_debug("msk=%p, token=%u\n", sk, subflow->token); subflow->map_seq = subflow->iasn; subflow->map_subflow_seq = 1; @@ -3603,7 +3609,7 @@ struct sock *parent = (void *)msk; bool ret = true; - pr_debug("msk=%p, subflow=%p", msk, subflow); + pr_debug("msk=%p, subflow=%p\n", msk, subflow); /* mptcp socket already closing? */ if (!mptcp_is_fully_established(parent)) { @@ -3649,7 +3655,7 @@ static void mptcp_shutdown(struct sock *sk, int how) { - pr_debug("sk=%p, how=%d", sk, how); + pr_debug("sk=%p, how=%d\n", sk, how);
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/protocol.h
Changed
@@ -517,7 +517,8 @@ stale : 1, /* unable to snd/rcv data, do not use for xmit */ valid_csum_seen : 1, /* at least one csum validated */ is_mptfo : 1, /* subflow is doing TFO */ - __unused : 10; + close_event_done : 1, /* has done the post-closed part */ + __unused : 9; bool data_avail; bool scheduled; bool pm_listener; /* a listener managed by the kernel PM? */ @@ -1132,7 +1133,7 @@ static inline void __mptcp_do_fallback(struct mptcp_sock *msk) { if (__mptcp_check_fallback(msk)) { - pr_debug("TCP fallback already done (msk=%p)", msk); + pr_debug("TCP fallback already done (msk=%p)\n", msk); return; } set_bit(MPTCP_FALLBACK_DONE, &msk->flags); @@ -1168,7 +1169,7 @@ } } -#define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)", __func__, a) +#define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)\n", __func__, a) static inline bool mptcp_check_infinite_map(struct sk_buff *skb) {
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/sched.c
Changed
@@ -64,7 +64,7 @@ list_add_tail_rcu(&sched->list, &mptcp_sched_list); spin_unlock(&mptcp_sched_list_lock); - pr_debug("%s registered", sched->name); + pr_debug("%s registered\n", sched->name); return 0; } @@ -96,7 +96,7 @@ if (msk->sched->init) msk->sched->init(msk); - pr_debug("sched=%s", msk->sched->name); + pr_debug("sched=%s\n", msk->sched->name); return 0; }
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/sockopt.c
Changed
@@ -879,7 +879,7 @@ struct mptcp_sock *msk = mptcp_sk(sk); struct sock *ssk; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); if (level == SOL_SOCKET) return mptcp_setsockopt_sol_socket(msk, optname, optval, optlen); @@ -1448,7 +1448,7 @@ struct mptcp_sock *msk = mptcp_sk(sk); struct sock *ssk; - pr_debug("msk=%p", msk); + pr_debug("msk=%p\n", msk); /* @@ the meaning of setsockopt() when the socket is connected and * there are multiple subflows is not yet defined. It is up to the
View file
_service:recompress:tar_scm:kernel.tar.gz/net/mptcp/subflow.c
Changed
@@ -38,7 +38,7 @@ { struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); - pr_debug("subflow_req=%p", subflow_req); + pr_debug("subflow_req=%p\n", subflow_req); if (subflow_req->msk) sock_put((struct sock *)subflow_req->msk); @@ -151,7 +151,7 @@ struct mptcp_options_received mp_opt; bool opt_mp_capable, opt_mp_join; - pr_debug("subflow_req=%p, listener=%p", subflow_req, listener); + pr_debug("subflow_req=%p, listener=%p\n", subflow_req, listener); #ifdef CONFIG_TCP_MD5SIG /* no MPTCP if MD5SIG is enabled on this socket or we may run out of @@ -228,7 +228,7 @@ } if (subflow_use_different_sport(subflow_req->msk, sk_listener)) { - pr_debug("syn inet_sport=%d %d", + pr_debug("syn inet_sport=%d %d\n", ntohs(inet_sk(sk_listener)->inet_sport), ntohs(inet_sk((struct sock *)subflow_req->msk)->inet_sport)); if (!mptcp_pm_sport_in_anno_list(subflow_req->msk, sk_listener)) { @@ -247,7 +247,7 @@ return -EPERM; } - pr_debug("token=%u, remote_nonce=%u msk=%p", subflow_req->token, + pr_debug("token=%u, remote_nonce=%u msk=%p\n", subflow_req->token, subflow_req->remote_nonce, subflow_req->msk); } @@ -517,7 +517,7 @@ subflow->rel_write_seq = 1; subflow->conn_finished = 1; subflow->ssn_offset = TCP_SKB_CB(skb)->seq; - pr_debug("subflow=%p synack seq=%x", subflow, subflow->ssn_offset); + pr_debug("subflow=%p synack seq=%x\n", subflow, subflow->ssn_offset); mptcp_get_options(skb, &mp_opt); if (subflow->request_mptcp) { @@ -549,7 +549,7 @@ subflow->thmac = mp_opt.thmac; subflow->remote_nonce = mp_opt.nonce; WRITE_ONCE(subflow->remote_id, mp_opt.join_id); - pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d", + pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d\n", subflow, subflow->thmac, subflow->remote_nonce, subflow->backup); @@ -575,7 +575,7 @@ MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_JOINSYNACKBACKUPRX); if (subflow_use_different_dport(msk, sk)) { - pr_debug("synack inet_dport=%d %d", + pr_debug("synack inet_dport=%d %d\n", ntohs(inet_sk(sk)->inet_dport), ntohs(inet_sk(parent)->inet_dport)); MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_JOINPORTSYNACKRX); @@ -643,7 +643,7 @@ { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - pr_debug("subflow=%p", subflow); + pr_debug("subflow=%p\n", subflow); /* Never answer to SYNs sent to broadcast or multicast */ if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST)) @@ -674,7 +674,7 @@ { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - pr_debug("subflow=%p", subflow); + pr_debug("subflow=%p\n", subflow); if (skb->protocol == htons(ETH_P_IP)) return subflow_v4_conn_request(sk, skb); @@ -794,7 +794,7 @@ struct mptcp_sock *owner; struct sock *child; - pr_debug("listener=%p, req=%p, conn=%p", listener, req, listener->conn); + pr_debug("listener=%p, req=%p, conn=%p\n", listener, req, listener->conn); /* After child creation we must look for MPC even when options * are not parsed @@ -885,7 +885,7 @@ ctx->conn = (struct sock *)owner; if (subflow_use_different_sport(owner, sk)) { - pr_debug("ack inet_sport=%d %d", + pr_debug("ack inet_sport=%d %d\n", ntohs(inet_sk(sk)->inet_sport), ntohs(inet_sk((struct sock *)owner)->inet_sport)); if (!mptcp_pm_sport_in_anno_list(owner, sk)) { @@ -942,7 +942,7 @@ static void dbg_bad_map(struct mptcp_subflow_context *subflow, u32 ssn) { - pr_debug("Bad mapping: ssn=%d map_seq=%d map_data_len=%d", + pr_debug("Bad mapping: ssn=%d map_seq=%d map_data_len=%d\n", ssn, subflow->map_subflow_seq, subflow->map_data_len); } @@ -1104,7 +1104,7 @@ data_len = mpext->data_len; if 
(data_len == 0) { - pr_debug("infinite mapping received"); + pr_debug("infinite mapping received\n"); MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX); subflow->map_data_len = 0; return MAPPING_INVALID; @@ -1114,7 +1114,7 @@ if (data_len == 1) { bool updated = mptcp_update_rcv_data_fin(msk, mpext->data_seq, mpext->dsn64); - pr_debug("DATA_FIN with no payload seq=%llu", mpext->data_seq); + pr_debug("DATA_FIN with no payload seq=%llu\n", mpext->data_seq); if (subflow->map_valid) { /* A DATA_FIN might arrive in a DSS * option before the previous mapping @@ -1139,7 +1139,7 @@ data_fin_seq &= GENMASK_ULL(31, 0); mptcp_update_rcv_data_fin(msk, data_fin_seq, mpext->dsn64); - pr_debug("DATA_FIN with mapping seq=%llu dsn64=%d", + pr_debug("DATA_FIN with mapping seq=%llu dsn64=%d\n", data_fin_seq, mpext->dsn64); } @@ -1186,7 +1186,7 @@ if (unlikely(subflow->map_csum_reqd != csum_reqd)) return MAPPING_INVALID; - pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%d:%u", + pr_debug("new map seq=%llu subflow_seq=%u data_len=%u csum=%d:%u\n", subflow->map_seq, subflow->map_subflow_seq, subflow->map_data_len, subflow->map_csum_reqd, subflow->map_data_csum); @@ -1221,7 +1221,7 @@ avail_len = skb->len - offset; incr = limit >= avail_len ? avail_len + fin : limit; - pr_debug("discarding=%d len=%d offset=%d seq=%d", incr, skb->len, + pr_debug("discarding=%d len=%d offset=%d seq=%d\n", incr, skb->len, offset, subflow->map_subflow_seq); MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DUPDATA); tcp_sk(ssk)->copied_seq += incr; @@ -1322,7 +1322,7 @@ old_ack = READ_ONCE(msk->ack_seq); ack_seq = mptcp_subflow_get_mapped_dsn(subflow); - pr_debug("msk ack_seq=%llx subflow ack_seq=%llx", old_ack, + pr_debug("msk ack_seq=%llx subflow ack_seq=%llx\n", old_ack, ack_seq); if (unlikely(before64(ack_seq, old_ack))) { mptcp_subflow_discard_data(ssk, skb, old_ack - ack_seq); @@ -1394,7 +1394,7 @@ subflow->map_valid = 0; WRITE_ONCE(subflow->data_avail, false); - pr_debug("Done with mapping: seq=%u data_len=%u", + pr_debug("Done with mapping: seq=%u data_len=%u\n", subflow->map_subflow_seq, subflow->map_data_len); } @@ -1504,7 +1504,7 @@ target = mapped ? &subflow_v6m_specific : subflow_default_af_ops(sk); - pr_debug("subflow=%p family=%d ops=%p target=%p mapped=%d", + pr_debug("subflow=%p family=%d ops=%p target=%p mapped=%d\n", subflow, sk->sk_family, icsk->icsk_af_ops, target, mapped); if (likely(icsk->icsk_af_ops == target)) @@ -1597,7 +1597,7 @@ goto failed; mptcp_crypto_key_sha(subflow->remote_key, &remote_token, NULL); - pr_debug("msk=%p remote_token=%u local_id=%d remote_id=%d", msk, + pr_debug("msk=%p remote_token=%u local_id=%d remote_id=%d\n", msk, remote_token, local_id, remote_id); subflow->remote_token = remote_token; WRITE_ONCE(subflow->remote_id, remote_id); @@ -1732,7 +1732,7 @@ SOCK_INODE(sf)->i_gid = SOCK_INODE(sk->sk_socket)->i_gid; subflow = mptcp_subflow_ctx(sf->sk); - pr_debug("subflow=%p", subflow); + pr_debug("subflow=%p\n", subflow); *new_sock = sf; sock_hold(sk); @@ -1761,7 +1761,7 @@ INIT_LIST_HEAD(&ctx->node);
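The MPTCP hunks above (fastopen.c through subflow.c) only terminate existing pr_debug() format strings with \n so that each message is emitted as a complete line. A quick way to observe the effect, assuming the kernel is built with CONFIG_DYNAMIC_DEBUG and debugfs is mounted in the usual place:

    # enable the pr_debug() call sites per file, then watch the log
    echo 'file net/mptcp/subflow.c +p' > /sys/kernel/debug/dynamic_debug/control
    echo 'file net/mptcp/pm_netlink.c +p' > /sys/kernel/debug/dynamic_debug/control
    dmesg --follow | grep -i mptcp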
View file
_service:recompress:tar_scm:kernel.tar.gz/security/integrity/ima/ima_template_lib.c
Changed
@@ -318,15 +318,21 @@ hash_algo_name[hash_algo]); } - if (digest) + if (digest) { memcpy(buffer + offset, digest, digestsize); - else + } else { /* * If digest is NULL, the event being recorded is a violation. * Make room for the digest by increasing the offset by the - * hash algorithm digest size. + * hash algorithm digest size. If the hash algorithm is not + * specified increase the offset by IMA_DIGEST_SIZE which + * fits SHA1 or MD5 */ - offset += hash_digest_size[hash_algo]; + if (hash_algo < HASH_ALGO__LAST) + offset += hash_digest_size[hash_algo]; + else + offset += IMA_DIGEST_SIZE; + } return ima_write_template_field_data(buffer, offset + digestsize, fmt, field_data);
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/perf/tests/builtin-test.c
Changed
@@ -139,6 +139,7 @@ &workload__sqrtloop, &workload__brstack, &workload__datasym, + &workload__traploop }; static int num_subtests(const struct test_suite *t)
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/perf/tests/shell/test_brstack.sh
Changed
@@ -12,7 +12,6 @@ fi TMPDIR=$(mktemp -d /tmp/__perf_test.program.XXXXX) -TESTPROG="perf test -w brstack" cleanup() { rm -rf $TMPDIR @@ -20,11 +19,21 @@ trap cleanup EXIT TERM INT +is_arm64() { + uname -m | grep -q aarch64 +} + +if is_arm64; then + TESTPROG="perf test -w brstack 5000" +else + TESTPROG="perf test -w brstack" +fi + test_user_branches() { echo "Testing user branch stack sampling" perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1 - perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script + perf script -i $TMPDIR/perf.data --fields brstacksym | tr ' ' '\n' > $TMPDIR/perf.script # example of branch entries: # brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL @@ -38,12 +47,43 @@ grep -E -m1 "^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET/.*$" $TMPDIR/perf.script grep -E -m1 "^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND/.*$" $TMPDIR/perf.script grep -E -m1 "^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND/.*$" $TMPDIR/perf.script + + if is_arm64; then + # in arm64 with BRBE, we get IRQ entries that correspond + # to any point in the process + grep -m1 "/IRQ/" $TMPDIR/perf.script + fi set +x # some branch types are still not being tested: # IND COND_CALL COND_RET SYSCALL SYSRET IRQ SERROR NO_TX } +test_arm64_trap_eret_branches() { + echo "Testing trap & eret branches (arm64 brbe)" + perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- \ + perf test -w traploop 250 + perf script -i $TMPDIR/perf.data --fields brstacksym | tr ' ' '\n' > $TMPDIR/perf.script + set -x + # BRBINF<n>.TYPE == TRAP are mapped to PERF_BR_SYSCALL by the BRBE driver + grep -E -m1 "^trap_bench\+[^ ]*/\[unknown\][^ ]*/SYSCALL/" $TMPDIR/perf.script + grep -E -m1 "^\[unknown\][^ ]*/trap_bench\+[^ ]*/ERET/" $TMPDIR/perf.script + set +x +} + +test_arm64_kernel_branches() { + echo "Testing kernel branches (arm64 brbe)" + # skip if perf doesn't have enough privileges + if ! perf record --branch-filter any,k -o- -- true > /dev/null; then + echo "skipped: not enough privileges" + return 0 + fi + perf record -o $TMPDIR/perf.data --branch-filter any,k -- uname -a + perf script -i $TMPDIR/perf.data --fields brstack | tr ' ' '\n' > $TMPDIR/perf.script + grep -E -m1 "0xffff[0-9a-f]{12}" $TMPDIR/perf.script + ! egrep -E -m1 "0x0000[0-9a-f]{12}" $TMPDIR/perf.script +} + # first argument <arg0> is the argument passed to "--branch-stack <arg0>,save_type,u" # second argument are the expected branch types for the given filter test_filter() { @@ -53,7 +93,7 @@ echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)" perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1 - perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script + perf script -i $TMPDIR/perf.data --fields brstack | tr ' ' '\n' | sed '/^[[:space:]]*$/d' > $TMPDIR/perf.script # fail if we find any branch type that doesn't match any of the expected ones # also consider UNKNOWN branch types (-) @@ -66,11 +106,16 @@ test_user_branches -test_filter "any_call" "CALL|IND_CALL|COND_CALL|SYSCALL|IRQ" +if is_arm64; then + test_arm64_trap_eret_branches + test_arm64_kernel_branches +fi + +test_filter "any_call" "CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|FAULT_DATA|FAULT_INST" test_filter "call" "CALL|SYSCALL" test_filter "cond" "COND" test_filter "any_ret" "RET|COND_RET|SYSRET|ERET" test_filter "call,cond" "CALL|SYSCALL|COND" -test_filter "any_call,cond" "CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND" -test_filter "cond,any_call,any_ret" "COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET" +test_filter "any_call,cond" "CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND|FAULT_DATA|FAULT_INST" +test_filter "cond,any_call,any_ret" "COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET|FAULT_DATA|FAULT_INST"
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/perf/tests/tests.h
Changed
@@ -205,6 +205,7 @@ DECLARE_WORKLOAD(sqrtloop); DECLARE_WORKLOAD(brstack); DECLARE_WORKLOAD(datasym); +DECLARE_WORKLOAD(traploop); extern const char *dso_to_test;
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/perf/tests/workloads/Build
Changed
@@ -6,8 +6,10 @@ perf-y += sqrtloop.o perf-y += brstack.o perf-y += datasym.o +perf-y += traploop.o CFLAGS_sqrtloop.o = -g -O0 -fno-inline -U_FORTIFY_SOURCE CFLAGS_leafloop.o = -g -O0 -fno-inline -fno-omit-frame-pointer -U_FORTIFY_SOURCE CFLAGS_brstack.o = -g -O0 -fno-inline -U_FORTIFY_SOURCE CFLAGS_datasym.o = -g -O0 -fno-inline -U_FORTIFY_SOURCE +CFLAGS_traploop.o = -g -O0 -fno-inline -U_FORTIFY_SOURCE
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/perf/tests/workloads/traploop.c
Added
@@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdlib.h> +#include "../tests.h" + +#define BENCH_RUNS 999999 + +static volatile int cnt; + +#ifdef __aarch64__ +static void trap_bench(void) +{ + unsigned long val; + + asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */ +} +#else +static void trap_bench(void) +{ + +} +#endif + +static int traploop(int argc, const char **argv) +{ + int num_loops = BENCH_RUNS; + + if (argc > 0) + num_loops = atoi(argv[0]); + + while (1) { + if ((cnt++) > num_loops) + break; + + trap_bench(); + } + return 0; +} + +DEFINE_WORKLOAD(traploop);
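On arm64 the workload's MRS read of ID_AA64ISAR0_EL1 from user space traps into the kernel and returns via ERET, which is what gives the BRBE driver the TRAP/ERET branch records that test_brstack.sh then greps for as SYSCALL and ERET entries. A manual run mirroring the test (the output path here is arbitrary; the flags are the ones the test uses):

    perf record -o /tmp/brbe.data --branch-filter any,save_type,u -- \
            perf test -w traploop 250
    perf script -i /tmp/brbe.data --fields brstacksym | tr ' ' '\n' | grep -E 'SYSCALL|ERET'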
View file
_service:recompress:tar_scm:kernel.tar.gz/tools/testing/selftests/net/mptcp/mptcp_join.sh
Changed
@@ -419,12 +419,17 @@ fi } +start_events() +{ + mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid + mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid +} + reset_with_events() { reset "${1}" || return 1 - mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid - mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid + start_events } reset_with_tcp_filter() @@ -3438,6 +3443,36 @@ fi } +# $1: ns ; $2: event type ; $3: count +chk_evt_nr() +{ + local ns=${1} + local evt_name="${2}" + local exp="${3}" + + local evts="${evts_ns1}" + local evt="${!evt_name}" + local count + + evt_name="${evt_name:16}" # without MPTCP_LIB_EVENT_ + [ "${ns}" == "ns2" ] && evts="${evts_ns2}" + + print_check "event ${ns} ${evt_name} (${exp})" + + if [[ "${evt_name}" = "LISTENER_"* ]] && + ! mptcp_lib_kallsyms_has "mptcp_event_pm_listener$"; then + print_skip "event not supported" + return + fi + + count=$(grep -cw "type:${evt}" "${evts}") + if [ "${count}" != "${exp}" ]; then + fail_test "got ${count} events, expected ${exp}" + else + print_ok + fi +} + userspace_tests() { # userspace pm type prevents add_addr @@ -3677,6 +3712,7 @@ if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + start_events pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 0 3 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow @@ -3728,12 +3764,28 @@ mptcp_lib_kill_wait $tests_pid + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 4 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 5 # one has been closed before estab + chk_join_nr 6 6 6 chk_rm_nr 4 4 fi # remove and re-add - if reset "delete re-add signal" && + if reset_with_events "delete re-add signal" && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 3 3 @@ -3741,7 +3793,7 @@ # broadcast IP: no packet for this address will be received on ns1 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal - test_linkfail=4 speed=20 \ + test_linkfail=4 speed=5 \ run_tests $ns1 $ns2 10.0.1.1 & local tests_pid=$!
@@ -3770,13 +3822,39 @@ pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 - chk_subflow_nr "after re-add" 3 + chk_subflow_nr "after re-add ID 0" 3 + chk_mptcp_info subflows 3 subflows 3 + + pm_nl_del_endpoint $ns1 99 10.0.1.1 + sleep 0.5 + chk_subflow_nr "after re-delete ID 0" 2 + chk_mptcp_info subflows 2 subflows 2 + + pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal + wait_mpj $ns2 + chk_subflow_nr "after re-re-add ID 0" 3 chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid - chk_join_nr 4 4 4 - chk_add_nr 5 5 - chk_rm_nr 3 2 invert + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 3 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 3 + + chk_join_nr 5 5 5 + chk_add_nr 6 6 + chk_rm_nr 4 3 invert fi # flush and re-add